Chrome uses CLD3 to perform language detection on all webpages. This language detection is generally very accurate. Chrome will override the detection result if there is a language attribute on the HTML tag or a language specified in the HTTP Content-Language header.
However, both of these signals are often specified incorrectly by the site or page. In particular, many non-English sites report their language as English (presumably because of default values from authoring tools, etc.). Currently, a whitelist is used to determine whether the detected language should always override the language attribute / Content-Language.
We should investigate whether there is a way to always override the language attribute / Content-Language when there is a high-confidence language detection result.
See discussion here: https://docs.google.com/document/d/12VPSMW1sq9DxX2UOZneMe6MKpjcOlkJ3KGgX4xXwkBg
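To make the proposal concrete, here is a rough sketch of the decision being suggested. It is not Chrome's actual code: the function name, the 0.9 threshold, and the choice to consult the HTML language attribute before Content-Language are all assumptions for illustration, and the real threshold would need to come out of the investigation.

```python
from typing import Optional

# Hypothetical cutoff for "high confidence"; this issue does not specify one.
HIGH_CONFIDENCE_THRESHOLD = 0.9


def choose_page_language(
    detected: str,
    confidence: float,
    html_lang: Optional[str],
    content_language: Optional[str],
    threshold: float = HIGH_CONFIDENCE_THRESHOLD,
) -> str:
    """Pick the page language from the detector result and declared signals.

    `detected` and `confidence` stand in for the detector's output.
    `html_lang` and `content_language` are the page-declared values,
    or None/empty when absent.
    """
    # Assumption: the HTML attribute is consulted before the HTTP header.
    declared = html_lang or content_language

    # With no declared language, the detector is the only signal.
    if not declared:
        return detected

    # Proposed behavior: a high-confidence detection overrides whatever
    # the page declares, with no per-language whitelist involved.
    if confidence >= threshold:
        return detected

    # Otherwise keep today's behavior and trust the declared language.
    return declared
```

For example, a Japanese page that declares English would resolve to `choose_page_language("ja", 0.98, "en", None)`, which returns `"ja"`, while the same page with a detection confidence of 0.40 would still be treated as `"en"`.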
Comment 1 by napper@chromium.org, Oct 5 2017