
Issue 706606 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Apr 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 2
Type: Bug




TranslateExtension has inaccurate LangID predictions for short segments

Project Member Reported by riesa@chromium.org, Mar 29 2017

Issue description

Background: The Translate Extension allows users to highlight text and get an on-the-fly translation without having to copy-and-paste into the translate.google.com website frontend.

Issue: In order to translate, the extension must determine the source language of the text the user has highlighted. It does this by calling the Chrome LangID model on that text. However, the internal Chrome LangID model, CLD3, is optimized for documents and known to be inaccurate for very short text segments. 

Solution: We will set the "is_reliable" field of the LangID response to false for inputs shorter than a certain threshold measured in bytes. When is_reliable is set to false, the extension attempts to gather surrounding context on the page and retry LangID.
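
A minimal sketch of the behavior described above, not the code that landed in i18n_custom_bindings.cc. The LangIdResult struct, the Detector callback, and both function names are hypothetical stand-ins introduced for illustration; only the is_reliable field and the 50-byte threshold (from the later comments on this issue) come from the bug itself. The retry helper assumes the caller has already collected the surrounding page text.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical result type; only the is_reliable field name comes from the issue.
struct LangIdResult {
  std::string language;
  bool is_reliable;
};

// Threshold stated in this issue: inputs below 50 bytes are marked unreliable.
constexpr std::size_t kMinReliableBytes = 50;

// Stand-in for the real CLD3 call, so the sketch stays self-contained.
using Detector = std::function<LangIdResult(const std::string&)>;

// Runs the detector, then forces is_reliable to false for short inputs.
// std::string::size() counts bytes, not characters, which matches a
// threshold "measured in bytes".
LangIdResult DetectWithThreshold(const std::string& text,
                                 const Detector& detect) {
  LangIdResult result = detect(text);
  if (text.size() < kMinReliableBytes) {
    result.is_reliable = false;
  }
  return result;
}

// Caller-side retry (assumed shape): if the highlighted text is too short
// to trust, run detection again on the surrounding context.
LangIdResult DetectWithContext(const std::string& selection,
                               const std::string& surrounding,
                               const Detector& detect) {
  LangIdResult first = DetectWithThreshold(selection, detect);
  if (first.is_reliable) {
    return first;
  }
  return DetectWithThreshold(surrounding, detect);
}
```

Because the threshold is in bytes, the number of characters it covers depends on the encoding: if the text is UTF-8, 50 bytes is 50 ASCII characters but only about 16 characters of a script that uses three bytes per character.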
 

Comment 1 Deleted

Comment 2 by riesa@chromium.org, Apr 3 2017

Based on our development data, setting the minimum-reliable threshold to 50 bytes leads to an expected average F1 of 0.90.

Attachment: input size vs accuracy LangID.pdf (36.3 KB)

Comment 3 by riesa@chromium.org, Apr 3 2017

Labels: -Restrict-View-Google

Comment 4 by riesa@chromium.org, Apr 3 2017

Labels: Merge-Request-58

Comment 5 by bugdroid1@chromium.org, Apr 3 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/88953d9c7663f797a6c92b8b764812d0e84ef8d9

commit 88953d9c7663f797a6c92b8b764812d0e84ef8d9
Author: riesa <riesa@chromium.org>
Date: Mon Apr 03 22:34:47 2017

Sets is_reliable for CLD3 if below a minimum byte threshold of 50 bytes.

BUG= 706606 

Review-Url: https://codereview.chromium.org/2780323002
Cr-Commit-Position: refs/heads/master@{#461560}

[modify] https://crrev.com/88953d9c7663f797a6c92b8b764812d0e84ef8d9/extensions/renderer/i18n_custom_bindings.cc

Comment 6 by riesa@chromium.org, Apr 4 2017

Labels: -Merge-Request-58

Comment 7 by riesa@chromium.org, Apr 4 2017

Labels: Merge-Request-58

Comment 8 by sheriffbot@chromium.org, Apr 4 2017

Labels: -Merge-Request-58 Hotlist-Merge-Approved Merge-Approved-58
Your change meets the bar and is auto-approved for M58. Please go ahead and merge the CL to branch 3029 manually. Please contact milestone owner if you have questions.
Owners: amineer@(Android), cmasso@(iOS), bhthompson@(ChromeOS), govind@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot

Comment 9 by bugdroid1@chromium.org, Apr 5 2017

Labels: -merge-approved-58 merge-merged-3029
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e822d69494b4a297e8ce1ff17e88f806488071cb

commit e822d69494b4a297e8ce1ff17e88f806488071cb
Author: Rouslan Solomakhin <rouslan@chromium.org>
Date: Wed Apr 05 13:31:47 2017

[Merge M-58] Sets is_reliable for CLD3 if below a minimum byte threshold of 50 bytes.

BUG= 706606 

Review-Url: https://codereview.chromium.org/2780323002
Cr-Commit-Position: refs/heads/master@{#461560}
(cherry picked from commit 88953d9c7663f797a6c92b8b764812d0e84ef8d9)

Review-Url: https://codereview.chromium.org/2799643002 .
Cr-Commit-Position: refs/branch-heads/3029@{#587}
Cr-Branched-From: 939b32ee5ba05c396eef3fd992822fcca9a2e262-refs/heads/master@{#454471}

[modify] https://crrev.com/e822d69494b4a297e8ce1ff17e88f806488071cb/extensions/renderer/i18n_custom_bindings.cc

Comment 10 by riesa@chromium.org, Apr 10 2017

Status: Fixed (was: Assigned)
