
Issue 771861

Starred by 2 users

Issue metadata

Status: Assigned
Merged: issue 765006
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Task




Investigate always using CLD3 detected language as page language

Project Member Reported by napper@chromium.org, Oct 5 2017

Issue description

Chrome uses CLD3 to perform language detection on all webpages. This detection is generally very accurate. However, Chrome overrides the detected language if the page sets a language attribute on the HTML tag or a language is specified in the HTTP Content-Language header.

Both of these signals are often specified incorrectly by the site or page. In particular, many non-English sites report their language as English (presumably because of default values from authoring tools, etc.). Currently, a whitelist determines whether the detected language should always override the language attribute / Content-Language.
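
As a rough illustration of the behavior described above, here is a minimal Python sketch (not Chrome's actual code). The function name, the whitelist contents, and the choice to prefer the HTML language attribute over Content-Language when both are present are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical whitelist: detected languages that always win over the
# page's declared language. The real list in Chrome is not reproduced here.
ALWAYS_TRUST_DETECTED = {"ja", "zh"}


def current_page_language(
    detected: str,
    html_lang: Optional[str] = None,
    content_language: Optional[str] = None,
) -> str:
    """Pick the page language using declared signals plus a whitelist."""
    # Assumption: the HTML attribute takes precedence over the HTTP header.
    declared = html_lang or content_language
    if declared and detected not in ALWAYS_TRUST_DETECTED:
        # A declared language overrides detection unless the detected
        # language is whitelisted.
        return declared
    return detected
```

With this sketch, a Russian page that declares English is treated as English, which is the failure mode the issue describes; only whitelisted detected languages escape the override.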

We should investigate whether there is a way to always override the language attribute / Content-Language when language detection returns a high-confidence result.
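
One possible shape for such a rule, sketched in Python under stated assumptions: the detector reports a confidence score between 0 and 1, and the threshold value of 0.9 is a placeholder rather than a tuned number. The function and parameter names are hypothetical and do not correspond to Chrome or CLD3 APIs.

```python
from typing import Optional


def proposed_page_language(
    detected: str,
    confidence: float,
    html_lang: Optional[str] = None,
    content_language: Optional[str] = None,
    threshold: float = 0.9,  # placeholder value, would need tuning
) -> str:
    """Prefer the detected language whenever detection is confident."""
    # Assumption: the HTML attribute takes precedence over the HTTP header.
    declared = html_lang or content_language
    if confidence >= threshold or not declared:
        # High-confidence detection wins regardless of declared signals;
        # detection is also the fallback when nothing is declared.
        return detected
    return declared
```

The open question for the investigation is whether a single threshold like this is reliable enough across languages to replace the per-language whitelist.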


See discussion here: https://docs.google.com/document/d/12VPSMW1sq9DxX2UOZneMe6MKpjcOlkJ3KGgX4xXwkBg

 
Cc: groby@chromium.org mdw@chromium.org
Owner: ----
Mergedinto: 765006
Status: Duplicate (was: Assigned)
Owner: anthonyvd@chromium.org
Status: Available (was: Duplicate)
Actually, I realized this was a bug for follow-up work.
Cc: yyushkina@chromium.org
Status: Assigned (was: Available)
Cc: anthonyvd@chromium.org
Owner: frechette@chromium.org
+frechette@ who's looking at this.
