
Issue 634961

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Apr 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All , Mac
Pri: 2
Type: Feature




Add UMA histograms for translation decision outcomes

Project Member Reported by rogerm@chromium.org, Aug 5 2016

Issue description

We lack insight into the usage of the translate features by src and dest languages.

Add histograms to track each of the translation outcomes:

* Ignored the prompt
* Accepted Once
* Accepted for Site
* Accepted for Language
* Declined Once
* Declined for Site
* Declined for Language

Bucket the histograms by {src, dst} language pair.
 
Cc: hamelphi@chromium.org

Comment 2 by zkoch@chromium.org, Aug 5 2016

This seems like...a *lot* of histograms, no?

Comment 3 by groby@chromium.org, Aug 5 2016

That is a *lot* of histograms. What is the reason we want to break this up by src/dst pair? What particular goal are we trying to achieve?

For the dst part, you could slice the data by geolocation, which is a reasonable approximation. That leaves by src, which we already have.

Alternatively, we'll either need support for sparse histograms in UMA, or we'll need to multiply this out and then break it up in a postprocessing step. 

(I.e. compute a histogram key as something like (src-index * 200 + dst-index) * 10 + choice-index.)
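
A minimal sketch of the key packing suggested above, together with the inverse a postprocessing step would need. The constants 200 and 10 are taken from the multipliers in the formula; the function names and the range checks are illustrative assumptions, not part of any Chromium API.

```python
from typing import Tuple

# Assumed sizes, read off the multipliers in the suggested formula:
# up to 200 language slots and up to 10 outcome (choice) slots.
NUM_LANGUAGES = 200
NUM_CHOICES = 10


def encode_key(src_index: int, dst_index: int, choice_index: int) -> int:
    """Pack (src, dst, choice) into a single integer histogram sample."""
    if not (0 <= src_index < NUM_LANGUAGES and 0 <= dst_index < NUM_LANGUAGES):
        raise ValueError("language index out of range")
    if not 0 <= choice_index < NUM_CHOICES:
        raise ValueError("choice index out of range")
    return (src_index * NUM_LANGUAGES + dst_index) * NUM_CHOICES + choice_index


def decode_key(key: int) -> Tuple[int, int, int]:
    """Invert encode_key; this is the postprocessing half of the scheme."""
    pair, choice_index = divmod(key, NUM_CHOICES)
    src_index, dst_index = divmod(pair, NUM_LANGUAGES)
    return src_index, dst_index, choice_index
```

Because each index is bounded by its multiplier, the packing is reversible, so a single sparse histogram can carry all three dimensions and be split apart again offline.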


Re: # of histograms

The originally planned approach was to do as you suggest: encode the event
info into a sparse histogram, exclude it from histograms.xml, and use
dremel to custom slice and dice the data.

It's a fairly small change to go back to that.

After chatting with the Finch folks (asvitkine, jwd and rkaplow) a while
back, they suggested that the existing tools and visualizations wouldn't
handle the breakdown of the 3 dimensions well, but should work OK for
src/dest pairs (2 dims) split into separate metrics (the 3rd dim).

I'm fine with encoding it all into one histogram; it just means building
additional queries and visualizations to actually see the data instead of
using those already built by the Metrics team.

Re: src/dest

So there are two issues this metric is intended to help us address:

1. DPM is too high in some regions

Dest as an approximation works well for much of the developed world, yes;
but, for the next 1B, dest is actually a really poor approximation. In
particular, in many regions the dest for currently offered translations is
English because that's the UI language by default.

Similarly, there are multilingual regions where several languages are used
in similar proportions.

Also, users can pick the src (if we get it wrong) and dest (if they don't
want the ui language), which we otherwise can't capture.

2. Being smarter about choosing the dest language.

We're not (yet) offering translations away from the ui language even though
we know that some geos have substantial populations that use their device
in a language in which they are not fluent.

This dataset is also intended to be a learning source for experiments with
smarter dest language proposals.




Cc: abakalov@chromium.org
+abakalov@ for completeness as he is working on a bunch of CLD changes.
groby@ & zkoch@...

ping?

Comment 7 by zkoch@chromium.org, Aug 9 2016

So for me, the key question here is how frequently we are going to explore all of those different language pairs inside of the current metrics UI. My hunch is...not much; most of the value of this is to inform the model that we're trying to build. If that's the case, it seems that the single histogram that is dremelable is the way to go. Thoughts?

Comment 8 by rogerm@chromium.org, Aug 10 2016

ok, I'll collapse it to a single histogram.

Comment 9 by groby@chromium.org, Aug 13 2016

"DPM is too high in some regions" - sorry, what is DPM?

"on many regions the dest for currently offered translations is English because that's the UI language by default." - how is dest useful in the histograms, then? (Also, translate is working on improving that)

"Also, users can pick the src (if we get it wrong) and dest (if they don't
want the ui language), which we otherwise can't capture."
Does this happen often enough to be of concern?

We *do* offer translations away from UI language. There's a finch trial in Indonesia/Malaysia going on, and we're currently hooking up ULP to determine a user's language.

"This dataset is also intended to be a learning source for experiments with
smarter dest language proposals." 
Is this coordinated with e.g. the ULP efforts, and several other efforts around dest language?




Decisions per Mille

The number of times the user is interrupted to make a decision (like, would
you like to translate this page) per thousand navigations.
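
As a small worked example of that definition, DPM can be computed as below. The function name and arguments are hypothetical, chosen only to illustrate the arithmetic.

```python
def decisions_per_mille(decision_prompts: int, navigations: int) -> float:
    """Return the number of decision prompts per thousand navigations."""
    if navigations <= 0:
        raise ValueError("navigations must be positive")
    return 1000.0 * decision_prompts / navigations
```

For instance, 25 prompts over 10,000 navigations gives a DPM of 2.5.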

"on many regions the dest for currently offered translations is English..."

Dest is useful precisely because we want to experiment with offering other
languages.

AFAIK, we have no idea, because we don't have metrics.

ULP, to the best of my knowledge, only supports signed in users.

There are other efforts to consider local browsing history, region, device
settings, etc for non-signed in users to determine the best language to (a)
offer as a translation target and (b) offer suggested content (Zine, for
example).

"This dataset is also intended to be a learning source for experiments with..."

This is coordinated with the Chrome Machine Intelligence and DPM efforts
around language.
Hi Rachel, Zack.

Speaking just on the metrics side, we think the solution that Roger has described makes sense: one histogram per outcome, with buckets of [src-dest]. This keeps the absolute number of histograms low (< 10), and he can use a sparse histogram for this.

This will still not work that well with the internal tools, since the number of buckets will be very large, but in this case there will need to be custom analysis anyway. The reason to do this over putting all 3 dimensions in one giant histogram is that the giant histogram would be much less useful, while this version should still be usable reasonably well with the internal tools. The cost to both clients and our pipelines will be roughly equivalent in the two, so I prefer the split version, which will be much more understandable.

For why destination is needed, I don't think that looking at geolocation data is a good enough proxy for what the MI team is trying to achieve.
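
The split layout described in this comment (one sparse histogram per outcome, bucketed by src/dest pair) could be modelled roughly as follows. This is only a sketch of the data shape: the language table, the class, and its methods are hypothetical and do not correspond to the UMA API, and the outcome names are taken from the issue description.

```python
from collections import Counter, defaultdict

# Hypothetical language table; a real client would need a stable, shared list.
LANGUAGES = ["en", "fr", "id", "ms"]
LANG_INDEX = {code: i for i, code in enumerate(LANGUAGES)}

# Outcome names from the issue description: one histogram per outcome.
OUTCOMES = [
    "Ignored", "AcceptedOnce", "AcceptedForSite", "AcceptedForLanguage",
    "DeclinedOnce", "DeclinedForSite", "DeclinedForLanguage",
]


def pair_bucket(src: str, dst: str) -> int:
    """Map a (src, dst) language pair to a single integer bucket."""
    return LANG_INDEX[src] * len(LANGUAGES) + LANG_INDEX[dst]


class OutcomeHistograms:
    """One sparse histogram per outcome; only observed buckets are stored."""

    def __init__(self) -> None:
        self._histograms = defaultdict(Counter)

    def record(self, outcome: str, src: str, dst: str) -> None:
        if outcome not in OUTCOMES:
            raise ValueError("unknown outcome: " + outcome)
        self._histograms[outcome][pair_bucket(src, dst)] += 1

    def count(self, outcome: str, src: str, dst: str) -> int:
        return self._histograms[outcome][pair_bucket(src, dst)]
```

With this shape the number of histograms stays fixed at the number of outcomes, while the number of buckets grows with the square of the language count, which is why custom analysis is still expected.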

Comment 12 by zkoch@chromium.org, Aug 22 2016

I defer to you all on what is best! I don't have a super strong opinion here.
Components: -Metrics Internals>Metrics
In working with the translate, metrics and Chrome MI teams, we've reached a consensus to log a new custom proto to UMA that captures all of the details described above, plus some more needed by Chrome MI and desired by translate.

See: https://codereview.chromium.org/2394643002/

Comment 17 by bugdroid1@chromium.org, Nov 16 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce

commit 2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce
Author: rogerm <rogerm@chromium.org>
Date: Wed Nov 16 07:17:05 2016

Integrate TranslateEventProto UMA logging into TranslateManager.

This CL integrates the TranslateEventProto into the TranslateManager in order to report these events via UMA.

BUG=634961, 653700

Review-Url: https://codereview.chromium.org/2400503002
Cr-Commit-Position: refs/heads/master@{#432416}

[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/translate/chrome_translate_client.cc
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/translate/chrome_translate_client.h
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/translate/translate_manager_render_view_host_unittest.cc
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/ui/browser_commands.cc
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/ui/browser_window.h
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/ui/cocoa/browser_window_cocoa.h
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/ui/cocoa/browser_window_cocoa.mm
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/ui/translate/translate_bubble_factory.cc
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/ui/translate/translate_bubble_factory.h
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/ui/translate/translate_bubble_view_state_transition.h
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/ui/views/frame/browser_view.cc
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/ui/views/frame/browser_view.h
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/test/base/test_browser_window.cc
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/test/base/test_browser_window.h
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/components/metrics/proto/translate_event.proto
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/components/translate/core/browser/translate_manager.cc
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/components/translate/core/browser/translate_manager.h
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/components/translate/core/browser/translate_ranker.cc
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/components/translate/core/browser/translate_ranker.h
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/components/translate/core/browser/translate_ui_delegate.cc
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/tools/metrics/histograms/histograms.xml

 @rogerm, @hamelphi - should this be marked as fixed?
Status: Fixed (was: Unknown)
Components: -UI>Browser>Translate UI>Browser>Language>Translate
