TranslateExtension has inaccurate LangID predictions for short segments
Issue description

Background: The Translate Extension allows users to highlight text and get an on-the-fly translation without having to copy and paste it into the translate.google.com website frontend.

Issue: To translate, the extension must determine the source language of the text the user has highlighted. It does this by calling the Chrome LangID model on that text. However, the internal Chrome LangID model, CLD3, is optimized for documents and is known to be inaccurate for very short text segments.

Solution: We will set the "is_reliable" field of the LangID response to false for inputs shorter than a certain threshold, measured in bytes. When is_reliable is set to false, the extension attempts to gather surrounding context on the page and retry LangID.
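The proposed behaviour can be illustrated with a small sketch. This is not the Chromium code (the actual change is in C++, in extensions/renderer/i18n_custom_bindings.cc); it is a language-neutral illustration in Python. The names `LangIdResult`, `apply_min_length_rule`, `detect_with_context` and the `detect` callable are hypothetical. The sketch assumes that input length is counted in UTF-8 bytes, that the model's own reliability flag is kept for inputs at or above the threshold, and that "surrounding context" is simply prepended to the selection. The 50-byte value is the threshold used by the fix for this bug.

```python
from dataclasses import dataclass
from typing import Callable

# Threshold used by the fix for this bug (50 bytes).
MIN_RELIABLE_BYTES = 50


@dataclass
class LangIdResult:
    """Hypothetical stand-in for a LangID response."""
    language: str
    is_reliable: bool


def apply_min_length_rule(text: str, result: LangIdResult) -> LangIdResult:
    """Force is_reliable to False when the input is shorter than the threshold.

    Assumption: length is measured on the UTF-8 encoding of the text.
    Inputs at or above the threshold keep whatever the model reported.
    """
    if len(text.encode("utf-8")) < MIN_RELIABLE_BYTES:
        return LangIdResult(result.language, False)
    return result


def detect_with_context(
    selection: str,
    surrounding: str,
    detect: Callable[[str], LangIdResult],
) -> LangIdResult:
    """Retry LangID on a wider span when the first answer is unreliable.

    `detect` is a hypothetical callable wrapping the LangID model.
    How the extension really gathers context is not described in this
    issue; prepending `surrounding` is an assumption for illustration.
    """
    first = apply_min_length_rule(selection, detect(selection))
    if first.is_reliable:
        return first
    widened = f"{surrounding} {selection}".strip()
    return apply_min_length_rule(widened, detect(widened))
```

With a stub detector that always reports a reliable result, a 2-byte selection comes back unreliable on its own, and comes back reliable once at least 50 bytes of context are available for the retry.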
Apr 3 2017
Setting the minimum-reliable threshold to 50 bytes will, based on our development data, lead to an expected F1 of 0.90 on average.
Apr 3 2017
Apr 3 2017
Apr 3 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/88953d9c7663f797a6c92b8b764812d0e84ef8d9

commit 88953d9c7663f797a6c92b8b764812d0e84ef8d9
Author: riesa <riesa@chromium.org>
Date: Mon Apr 03 22:34:47 2017

Sets is_reliable for CLD3 if below a minimum byte threshold of 50 bytes.

BUG= 706606
Review-Url: https://codereview.chromium.org/2780323002
Cr-Commit-Position: refs/heads/master@{#461560}

[modify] https://crrev.com/88953d9c7663f797a6c92b8b764812d0e84ef8d9/extensions/renderer/i18n_custom_bindings.cc
Apr 4 2017
Apr 4 2017
Apr 4 2017
Your change meets the bar and is auto-approved for M58. Please go ahead and merge the CL to branch 3029 manually. Please contact milestone owner if you have questions. Owners: amineer@(Android), cmasso@(iOS), bhthompson@(ChromeOS), govind@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Apr 5 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/e822d69494b4a297e8ce1ff17e88f806488071cb

commit e822d69494b4a297e8ce1ff17e88f806488071cb
Author: Rouslan Solomakhin <rouslan@chromium.org>
Date: Wed Apr 05 13:31:47 2017

[Merge M-58] Sets is_reliable for CLD3 if below a minimum byte threshold of 50 bytes.

BUG= 706606
Review-Url: https://codereview.chromium.org/2780323002
Cr-Commit-Position: refs/heads/master@{#461560}
(cherry picked from commit 88953d9c7663f797a6c92b8b764812d0e84ef8d9)

Review-Url: https://codereview.chromium.org/2799643002 .
Cr-Commit-Position: refs/branch-heads/3029@{#587}
Cr-Branched-From: 939b32ee5ba05c396eef3fd992822fcca9a2e262-refs/heads/master@{#454471}

[modify] https://crrev.com/e822d69494b4a297e8ce1ff17e88f806488071cb/extensions/renderer/i18n_custom_bindings.cc
Apr 10 2017