
Issue 679871

Starred by 6 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 3
Type: Bug




Not being prompted for translation on Chrome Linux

Project Member Reported by k...@chromium.org, Jan 10 2017

Issue description

Chrome Version: 55.0.2883.87
OS: Linux

What steps will reproduce the problem?
(1) Open baidu.com

What is the expected result?

See prompt for translation.

What happens instead?

See no prompt for translation (see screenshot)
 
Attachment: Screenshot from 2017-01-10 14:13:25.png (52.8 KB)
Cc: groby@chromium.org

Comment 2 by groby@chromium.org, Jan 11 2017

Owner: abakalov@chromium.org
Status: Assigned (was: Untriaged)
Just repro'ed it myself. Language detection logs attached. Basically the language detector classifies baidu.com as "not detectable". That's likely a known issue in CLD3.

abakalov@ is aware of the root cause (char/unsigned char) and working on a fix. 
Attachment: translate_internals_detect_logs_dump (3).json (749 bytes)
Cc: djweiss@chromium.org riesa@chromium.org
Thanks for reporting this. CLD2 and CLD3 process input by first splitting based on script and then making a prediction for each resulting substring independently. For the baidu.com case, this leads to having several short snippets in Chinese and English (e.g., "About Baidu"). Both CLD2 and CLD3 find short text challenging, so we decided to introduce a length threshold. All of the substrings in baidu.com’s case are below this threshold, so the model predicts "unknown".
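
To make the behavior described above concrete, here is a minimal Python sketch of script splitting followed by a per-span length threshold. It is not the CLD2 or CLD3 code: `script_of`, `split_by_script`, `toy_predict`, `detect_page_language`, and the value of `MIN_SPAN_CHARS` are all hypothetical names and numbers chosen for illustration, and the "model" is a stand-in that simply maps a script to a language.

```python
import unicodedata
from collections import Counter

# Hypothetical threshold; the real value used by the detector is not stated here.
MIN_SPAN_CHARS = 20


def script_of(ch):
    """Rough script bucket for one character (illustrative, not exhaustive)."""
    if not ch.isalpha():
        return None  # spaces, digits, punctuation: no script of their own
    name = unicodedata.name(ch, "")
    if name.startswith("CJK"):
        return "Hani"
    if name.startswith("LATIN"):
        return "Latn"
    return "Other"


def split_by_script(text):
    """Split text into (script, substring) spans at every script change."""
    spans = []
    current_script, buf = None, []
    for ch in text:
        s = script_of(ch)
        if s is not None and current_script is not None and s != current_script:
            spans.append((current_script, "".join(buf).strip()))
            buf = []
        if s is not None:
            current_script = s
        buf.append(ch)
    if buf and current_script is not None:
        spans.append((current_script, "".join(buf).strip()))
    return spans


def toy_predict(script, text):
    """Stand-in for the per-span model: guesses a language from the script."""
    return {"Hani": "zh", "Latn": "en"}.get(script, "unknown")


def detect_page_language(text, predict=toy_predict):
    """Predict per span, ignoring spans shorter than MIN_SPAN_CHARS."""
    votes = Counter()
    for script, span in split_by_script(text):
        if len(span) < MIN_SPAN_CHARS:
            continue  # too short to trust, so it contributes nothing
        votes[predict(script, span)] += len(span)
    return votes.most_common(1)[0][0] if votes else "unknown"
```

With a baidu.com-like input such as `"关于百度 About Baidu 使用百度前必读"`, every span falls below the assumed threshold, so `detect_page_language` returns `"unknown"`, which mirrors the missing translation prompt.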

There are ways of relaxing this constraint:
1) consider the predictions for short pieces of text as well if the probability is above a strict threshold
2) extend the model to handle input text containing the strings X, Y, and Z where, for example, X and Z are in Hani script and Y is a very short text in Latin script: most likely X and Z are in the same language and Y is a name (e.g., "Baidu").
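
As a rough illustration of the two relaxations, the Python sketch below takes per-span predictions as input and applies both rules. Everything in it is an assumption made for the example: the `SpanPrediction` type, the constants `MIN_SPAN_CHARS`, `STRICT_PROBABILITY`, and `NAME_MAX_CHARS`, and the probabilities in the usage note. Option 2 is approximated with a simple neighbor heuristic rather than an actual model extension.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SpanPrediction:
    script: str         # e.g. "Hani" or "Latn"
    text: str           # the substring the prediction was made on
    language: str       # predicted language code
    probability: float  # model confidence for that prediction


# All three values are hypothetical, chosen only for this sketch.
MIN_SPAN_CHARS = 20
STRICT_PROBABILITY = 0.95
NAME_MAX_CHARS = 12


def is_sandwiched_name(spans, i):
    """Option 2 heuristic: a very short Latin span between two Hani spans."""
    if i == 0 or i == len(spans) - 1:
        return False
    prev, cur, nxt = spans[i - 1], spans[i], spans[i + 1]
    return (
        cur.script == "Latn"
        and len(cur.text) <= NAME_MAX_CHARS
        and prev.script == "Hani"
        and nxt.script == "Hani"
    )


def relaxed_page_language(spans):
    """Combine span predictions, letting confident short spans vote."""
    votes = Counter()
    for i, span in enumerate(spans):
        if is_sandwiched_name(spans, i):
            continue  # treat as a name such as "Baidu"; it does not vote
        long_enough = len(span.text) >= MIN_SPAN_CHARS
        confident = span.probability >= STRICT_PROBABILITY  # option 1
        if long_enough or confident:
            votes[span.language] += len(span.text)
    return votes.most_common(1)[0][0] if votes else "unknown"
```

For three spans like `关于百度` (zh, 0.97), `About Baidu` (en, 0.99), and `使用百度前必读` (zh, 0.96), with made-up probabilities, the Latin span is skipped as a likely name and the two confident Hani spans yield `"zh"`. If the same spans all had a probability of 0.5, the result would stay `"unknown"`.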

To be on the safe side, I personally prefer to make the CLD2/CLD3 switch with the current version of the model, and address cases challenging to both CLD2 and CLD3 in the following release. Please let me know if you think otherwise.
Labels: Hotlist-CLD3
Components: UI>Browser>Translate
Components: -UI>Browser>Translate UI>Browser>Language>Translate
Cc: abakalov@chromium.org rbasuvula@chromium.org ajha@chromium.org kavvaru@chromium.org brajkumar@chromium.org
Issue 732364 has been merged into this issue.
Cc: nyerramilli@chromium.org
Issue 837518 has been merged into this issue.
Issue 817119 has been merged into this issue.
