New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 732893 link

Starred by 2 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 3
Type: Bug

Blocking:
issue 725654



Sign in to add a comment

Wrong encoding is used when TextResource is reused via MemoryCache for multiple encodings

Project Member Reported by hirosh...@chromium.org, Jun 13 2017

Issue description

TextResource has its encoding and different encodings result in different text content.
However, MemoryCache doesn't check the encoding when reusing TextResource, so:
1. Request 1 with encoding A is sent. TextResource 1 with encoding A is created and added to MemoryCache.
2. Request 2 with encoding B is sent. MemoryCache returns TextResource 1 for Request 2, while the encodings are different. Thus Request 2 gets the text with encoding A, which is wrong.

Test (Script):
external/wpt/html/semantics/scripting-1/the-script-element/script-charset-02.html (currently has -expected.txt which contains FAIL)

 
Draft CL: https://chromium-review.googlesource.com/c/533896/
But this is not completely correct.

Also, how about cases where encoding is updated? Encoding detection like TextResourceDecoder::CheckForBOM()?
Blocking: 725654
Project Member

Comment 3 by bugdroid1@chromium.org, Jun 20 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/5ceed48c3549686d68ade225a1dc3becfb158427

commit 5ceed48c3549686d68ade225a1dc3becfb158427
Author: Hiroshige Hayashizaki <hiroshige@chromium.org>
Date: Tue Jun 20 22:07:02 2017

Make TextResourceDecoder::Create() take ContentType instead of mime type

The |mime_type| argument is used only to determine ContentType, and
set to a constant except when called from TextResourceDecoderBuilder.

This CL makes Create() to take ContentType directly as an argument
and moves DetermineContentType() to TextResourceDecoderBuilder.cpp.

Bug:  725654 , 732893
Change-Id: I582727705351a49d10ecffdfdd9c1578114b6321
Reviewed-on: https://chromium-review.googlesource.com/539957
Commit-Queue: Hiroshige Hayashizaki <hiroshige@chromium.org>
Reviewed-by: Kouhei Ueno <kouhei@chromium.org>
Reviewed-by: Takeshi Yoshino <tyoshino@chromium.org>
Reviewed-by: Kinuko Yasuda <kinuko@chromium.org>
Reviewed-by: Charlie Harrison <csharrison@chromium.org>
Cr-Commit-Position: refs/heads/master@{#480982}
[modify] https://crrev.com/5ceed48c3549686d68ade225a1dc3becfb158427/third_party/WebKit/Source/bindings/core/v8/ScriptStreamer.cpp
[modify] https://crrev.com/5ceed48c3549686d68ade225a1dc3becfb158427/third_party/WebKit/Source/core/fileapi/FileReaderLoader.cpp
[modify] https://crrev.com/5ceed48c3549686d68ade225a1dc3becfb158427/third_party/WebKit/Source/core/html/parser/TextResourceDecoder.cpp
[modify] https://crrev.com/5ceed48c3549686d68ade225a1dc3becfb158427/third_party/WebKit/Source/core/html/parser/TextResourceDecoder.h
[modify] https://crrev.com/5ceed48c3549686d68ade225a1dc3becfb158427/third_party/WebKit/Source/core/html/parser/TextResourceDecoderForFuzzing.h
[modify] https://crrev.com/5ceed48c3549686d68ade225a1dc3becfb158427/third_party/WebKit/Source/core/html/parser/TextResourceDecoderTest.cpp
[modify] https://crrev.com/5ceed48c3549686d68ade225a1dc3becfb158427/third_party/WebKit/Source/core/html/track/vtt/VTTParser.cpp
[modify] https://crrev.com/5ceed48c3549686d68ade225a1dc3becfb158427/third_party/WebKit/Source/core/inspector/InspectorPageAgent.cpp
[modify] https://crrev.com/5ceed48c3549686d68ade225a1dc3becfb158427/third_party/WebKit/Source/core/loader/TextResourceDecoderBuilder.cpp
[modify] https://crrev.com/5ceed48c3549686d68ade225a1dc3becfb158427/third_party/WebKit/Source/core/loader/resource/CSSStyleSheetResource.cpp
[modify] https://crrev.com/5ceed48c3549686d68ade225a1dc3becfb158427/third_party/WebKit/Source/core/loader/resource/DocumentResource.cpp
[modify] https://crrev.com/5ceed48c3549686d68ade225a1dc3becfb158427/third_party/WebKit/Source/core/loader/resource/ScriptResource.cpp
[modify] https://crrev.com/5ceed48c3549686d68ade225a1dc3becfb158427/third_party/WebKit/Source/core/loader/resource/StyleSheetResource.h
[modify] https://crrev.com/5ceed48c3549686d68ade225a1dc3becfb158427/third_party/WebKit/Source/core/loader/resource/TextResource.cpp
[modify] https://crrev.com/5ceed48c3549686d68ade225a1dc3becfb158427/third_party/WebKit/Source/core/loader/resource/TextResource.h
[modify] https://crrev.com/5ceed48c3549686d68ade225a1dc3becfb158427/third_party/WebKit/Source/core/loader/resource/XSLStyleSheetResource.cpp
[modify] https://crrev.com/5ceed48c3549686d68ade225a1dc3becfb158427/third_party/WebKit/Source/core/workers/WorkerScriptLoader.cpp
[modify] https://crrev.com/5ceed48c3549686d68ade225a1dc3becfb158427/third_party/WebKit/Source/core/xmlhttprequest/XMLHttpRequest.cpp
[modify] https://crrev.com/5ceed48c3549686d68ade225a1dc3becfb158427/third_party/WebKit/Source/platform/testing/FuzzedDataProvider.cpp
[modify] https://crrev.com/5ceed48c3549686d68ade225a1dc3becfb158427/third_party/WebKit/Source/platform/testing/FuzzedDataProvider.h

Cc: y...@yoav.ws
Components: -Blink>HTML>Script
Hmm, draft CL https://chromium-review.googlesource.com/c/533896/
is failing on trybots because of encoding mismatch related to LinkLoader and preloading via Link HTTP headers.

The encoding is set to document.Encoding() by LinkLoader:
https://codesearch.chromium.org/chromium/src/third_party/WebKit/Source/core/loader/LinkLoader.cpp?l=357
> link_fetch_params.SetCharset(document.Encoding());

But at this time, document.Encoding() is UTF-8, at least in some tests (because this is before processing the document content itself, which might contain <meta> tags etc. that can set the document encoding), and this can cause encoding mismatch in cases like:
(Link header for foo.js) -> use UTF-8 encoding
<html>
// encoding of this HTML is non-UTF-8
<script src="foo.js"> -> use non-UTF-8 encoding, causing encoding mismatch (current ToT) or loading foo.js twice (my CL).
</html>

Probably this is usually not an observable problem, because most of scripts are ASCII and can be decoded correctly as UTF-8 and most of encodings (e.g. Latin1, Shift_JIS, etc.) used for main documents.

The problem will be more visible if we forbid reusing Resource if the encodings of the loaded TextResource and the request are different, even when the actual text is ASCII-only and the two encodings are compatible with ASCII -- causing loading the Resource again (hopefully hitting the disk cache).
(Strictly speaking, the Resource can't be naively shared, if pending Resource or revalidation is considered)
(This situation might be common and not limited to the LinkLoader case above, and thus probably I have to measure the impact.)

yoav@, WDYT?
Project Member

Comment 5 by bugdroid1@chromium.org, Jun 21 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/b57b5b36e33de7d15c06615156e00a45c79a4c13

commit b57b5b36e33de7d15c06615156e00a45c79a4c13
Author: Hiroshige Hayashizaki <hiroshige@chromium.org>
Date: Wed Jun 21 01:06:24 2017

Refactor code around TextResourceDecoder::ShouldAutoDetect()

ShouldAutoDetect(), DetectTextEncoding() and SetEncoding() calls and
references to hint_* members are always coupled.
This CL moves these code to AutoDetectEncodingIfAllowed() to make
it clear that these are always coupled and executed only for
|kUseAllAutoDetection|.

Bug:  725654 , 732893
Change-Id: Ia5a295bb2f600a4566f28738d421266ebb900343
Reviewed-on: https://chromium-review.googlesource.com/540179
Commit-Queue: Hiroshige Hayashizaki <hiroshige@chromium.org>
Reviewed-by: Kouhei Ueno <kouhei@chromium.org>
Cr-Commit-Position: refs/heads/master@{#481055}
[modify] https://crrev.com/b57b5b36e33de7d15c06615156e00a45c79a4c13/third_party/WebKit/Source/core/html/parser/TextResourceDecoder.cpp
[modify] https://crrev.com/b57b5b36e33de7d15c06615156e00a45c79a4c13/third_party/WebKit/Source/core/html/parser/TextResourceDecoder.h

Project Member

Comment 6 by bugdroid1@chromium.org, Jun 21 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/ee353afe4b953401e505186f45e71e066592382a

commit ee353afe4b953401e505186f45e71e066592382a
Author: Hiroshige Hayashizaki <hiroshige@chromium.org>
Date: Wed Jun 21 06:21:40 2017

Introduce TextResourceDecoderOptions

This CL introduces TextResourceDecoderOptions and moves some members
and enums of TextResourceDecoder to TextResourceDecoderOptions,
so that the same TextResourceDecoderOptions result in the same
decoding result, including automatic encoding detection.

This is preparation for Issue 732893, where I'll stop reusing
Resource if TextResourceDecoderOptions is different.

Bug:  725654 , 732893
Change-Id: I9044b47f5d89f69a493e49036aa60cf32979f815
Reviewed-on: https://chromium-review.googlesource.com/540375
Commit-Queue: Hiroshige Hayashizaki <hiroshige@chromium.org>
Reviewed-by: Kouhei Ueno <kouhei@chromium.org>
Reviewed-by: Kinuko Yasuda <kinuko@chromium.org>
Reviewed-by: Takeshi Yoshino <tyoshino@chromium.org>
Cr-Commit-Position: refs/heads/master@{#481122}
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/bindings/core/v8/ScriptStreamer.cpp
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/core/fileapi/FileReaderLoader.cpp
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/core/html/parser/TextResourceDecoder.cpp
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/core/html/parser/TextResourceDecoder.h
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/core/html/parser/TextResourceDecoderForFuzzing.h
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/core/html/parser/TextResourceDecoderTest.cpp
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/core/html/track/vtt/VTTParser.cpp
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/core/inspector/InspectorPageAgent.cpp
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/core/loader/TextResourceDecoderBuilder.cpp
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/core/loader/TextResourceDecoderBuilder.h
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/core/loader/resource/CSSStyleSheetResource.cpp
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/core/loader/resource/DocumentResource.cpp
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/core/loader/resource/ScriptResource.cpp
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/core/loader/resource/StyleSheetResource.h
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/core/loader/resource/TextResource.cpp
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/core/loader/resource/TextResource.h
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/core/loader/resource/XSLStyleSheetResource.cpp
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/core/workers/WorkerScriptLoader.cpp
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/core/xmlhttprequest/XMLHttpRequest.cpp
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/modules/fetch/FetchDataLoader.cpp
[modify] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/platform/loader/BUILD.gn
[add] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/platform/loader/fetch/TextResourceDecoderOptions.cpp
[add] https://crrev.com/ee353afe4b953401e505186f45e71e066592382a/third_party/WebKit/Source/platform/loader/fetch/TextResourceDecoderOptions.h

Project Member

Comment 7 by bugdroid1@chromium.org, Jun 21 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/b17cd49a408ab1c8dfe642df07acab4f156f1514

commit b17cd49a408ab1c8dfe642df07acab4f156f1514
Author: Hiroshige Hayashizaki <hiroshige@chromium.org>
Date: Wed Jun 21 10:35:32 2017

Introduce NonTextResourceFactory

This is to simplify upcoming changes to charset handling.
NonTextResourceFactory::Create() and its overrides now don't have
charset argument and thus isn't needed to be modified by the
upcoming changes.

Bug:  725654 , 732893
Change-Id: I94104979bb65bdf65f98071343d9930d0a39fc1a
Reviewed-on: https://chromium-review.googlesource.com/540696
Commit-Queue: Hiroshige Hayashizaki <hiroshige@chromium.org>
Reviewed-by: Takeshi Yoshino <tyoshino@chromium.org>
Reviewed-by: Kouhei Ueno <kouhei@chromium.org>
Reviewed-by: Yutaka Hirano <yhirano@chromium.org>
Cr-Commit-Position: refs/heads/master@{#481166}
[modify] https://crrev.com/b17cd49a408ab1c8dfe642df07acab4f156f1514/third_party/WebKit/Source/core/loader/resource/FontResource.h
[modify] https://crrev.com/b17cd49a408ab1c8dfe642df07acab4f156f1514/third_party/WebKit/Source/core/loader/resource/ImageResource.cpp
[modify] https://crrev.com/b17cd49a408ab1c8dfe642df07acab4f156f1514/third_party/WebKit/Source/core/loader/resource/LinkFetchResource.h
[modify] https://crrev.com/b17cd49a408ab1c8dfe642df07acab4f156f1514/third_party/WebKit/Source/platform/loader/fetch/RawResource.h
[modify] https://crrev.com/b17cd49a408ab1c8dfe642df07acab4f156f1514/third_party/WebKit/Source/platform/loader/fetch/Resource.h
[modify] https://crrev.com/b17cd49a408ab1c8dfe642df07acab4f156f1514/third_party/WebKit/Source/platform/loader/testing/MockResource.cpp

Comment 8 by y...@yoav.ws, Jun 21 2017

I've hit similar issues in WebKit, which is not reusing MemoryCached resources if charset modifications are detected. Unfortunately I'm traveling this week, but would be happy to chat about this and the solution design next week.
Project Member

Comment 9 by bugdroid1@chromium.org, Jun 22 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/3987be28dc176cb9be540f2eb2202db2dca1b00a

commit 3987be28dc176cb9be540f2eb2202db2dca1b00a
Author: Hiroshige Hayashizaki <hiroshige@chromium.org>
Date: Thu Jun 22 00:14:57 2017

Use TextResourceDecoderOptions in FetchParameters and ResourceFactory

Previously, FetchParameters::Charset() was String, but to support UTF-8
enforcement in module scripts, the existing "UTF-8" requests (that can be
overridden by BOMs or HTTP headers) and enforced "UTF-8" requests must be
distinguished.

This CL uses TextResourceDecoderOptions (that can distinguish these two cases)
instead of String in ResourceFactory::Create() and FetchParameters.
The ContentType in TextResourceDecoderOptions in FetchParameters is
|kPlainTextContent| and is later overridden by ResourceFactory::ContentType()
in ResourceFetcher::RequestResource() before actual use.

As preparation for this CL, the preceding CLs replaced String that represents
an encoding with WTF::TextEncoding and introduced TextResourceDecoderOptions.

Bug:  725654 , 732893
Change-Id: I87e036059df08ce9f42779e2ce2fd694507e7571
Reviewed-on: https://chromium-review.googlesource.com/535983
Commit-Queue: Hiroshige Hayashizaki <hiroshige@chromium.org>
Reviewed-by: Takeshi Yoshino <tyoshino@chromium.org>
Reviewed-by: Kouhei Ueno <kouhei@chromium.org>
Reviewed-by: Kinuko Yasuda <kinuko@chromium.org>
Cr-Commit-Position: refs/heads/master@{#481368}
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/bindings/core/v8/ScriptStreamerTest.cpp
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/bindings/core/v8/V8ScriptRunnerTest.cpp
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/core/html/parser/CSSPreloadScannerTest.cpp
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/core/loader/resource/CSSStyleSheetResource.cpp
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/core/loader/resource/CSSStyleSheetResource.h
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/core/loader/resource/CSSStyleSheetResourceTest.cpp
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/core/loader/resource/DocumentResource.cpp
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/core/loader/resource/DocumentResource.h
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/core/loader/resource/ScriptResource.cpp
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/core/loader/resource/ScriptResource.h
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/core/loader/resource/StyleSheetResource.h
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/core/loader/resource/TextResource.cpp
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/core/loader/resource/TextResource.h
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/core/loader/resource/XSLStyleSheetResource.cpp
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/core/loader/resource/XSLStyleSheetResource.h
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/platform/loader/fetch/FetchParameters.cpp
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/platform/loader/fetch/FetchParameters.h
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/platform/loader/fetch/Resource.h
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/platform/loader/fetch/ResourceFetcher.cpp
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/platform/loader/fetch/ResourceFetcher.h
[modify] https://crrev.com/3987be28dc176cb9be540f2eb2202db2dca1b00a/third_party/WebKit/Source/platform/loader/fetch/TextResourceDecoderOptions.h

Sign in to add a comment