Add UMA histograms for translation decision outcomes
Issue description
We lack insight into the usage of the translate features by src and dest languages.
Add histograms to track each of the translation outcomes:
* Ignored the prompt
* Accepted Once
* Accepted for Site
* Accepted for Language
* Declined Once
* Declined for Site
* Declined for Language
Bucket the histograms by {src, dst} language pair.
,
Aug 5 2016
This seems like...a *lot* of histograms, no?
,
Aug 5 2016
That is a *lot* of histograms. What is the reason we want to break this up by src/dst pair? What particular goal are we trying to achieve?
For the dst part, you could slice the data by geolocation, which is a reasonable approximation. That leaves by src, which we already have.
Alternatively, we'll either need support for sparse histograms in UMA, or we'll need to multiply this out and then break it up in a postprocessing step.
(I.e., compute a histogram key as something like (src-index * 200 + dst-index) * 10 + choice-index.)
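To make the multiplied-out key concrete, here is a minimal sketch of the encoding suggested above, together with the inverse a postprocessing step would need. The limits of 200 languages and 10 choices are taken from the formula in the comment; the function names and the bounds checks are assumptions for illustration, not anything in Chromium.

```python
# Hypothetical limits, taken from the multipliers in the comment above.
NUM_LANGUAGES = 200
NUM_CHOICES = 10


def encode_key(src_index, dst_index, choice_index):
    """Pack (src, dst, choice) into one integer sample for a sparse histogram."""
    if not (0 <= src_index < NUM_LANGUAGES and 0 <= dst_index < NUM_LANGUAGES):
        raise ValueError("language index out of range")
    if not 0 <= choice_index < NUM_CHOICES:
        raise ValueError("choice index out of range")
    return (src_index * NUM_LANGUAGES + dst_index) * NUM_CHOICES + choice_index


def decode_key(key):
    """Invert encode_key; this is the postprocessing step on the analysis side."""
    pair, choice_index = divmod(key, NUM_CHOICES)
    src_index, dst_index = divmod(pair, NUM_LANGUAGES)
    return src_index, dst_index, choice_index
```

With these limits the key space has 200 * 200 * 10 = 400,000 possible values, which is why a sparse histogram (or a custom pipeline) comes up in the discussion.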
,
Aug 6 2016
Re: # of histograms
The originally planned approach was to do as you suggest: encode the event info into a sparse histogram, exclude it from histograms.xml, and use dremel to custom slice and dice the data. It's a fairly small change to go back to that. After chatting with the Finch folks (asvitkine, jwd and rkaplow) a while back, they suggested that the existing tools and visualizations wouldn't handle the breakdown of the 3 dimensions well, but should work OK for src/dest pairs (2 dims) split into separate metrics (the 3rd dim). I'm fine with encoding it all into one histogram; it just means building additional queries and visualizations to actually see the data instead of using those already built by the Metrics team.
Re: src/dest
So there are two issues this metric is intended to help us address:
1. DPM is too high in some regions.
Dest as an approximation works well for much of the developed world, yes; but, for the next 1B, dest is actually a really poor approximation. In particular, in many regions the dest for currently offered translations is English because that's the UI language by default. Similarly, there are multilingual regions where several languages are used in similar proportions. Also, users can pick the src (if we get it wrong) and dest (if they don't want the UI language), which we otherwise can't capture.
2. Being smarter about choosing the dest language.
We're not (yet) offering translations away from the UI language even though we know that some geos have substantial populations that use their device in a language in which they are not fluent. This dataset is also intended to be a learning source for experiments with smarter dest language proposals.
,
Aug 8 2016
+abakalov@ for completeness as he is working on a bunch of CLD changes.
,
Aug 9 2016
groby@ & zkoch@... ping?
,
Aug 9 2016
So for me, the key question here is how frequently are we going to explore all of those different language pairs inside of the current metrics UI. My hunch is...not much. That most of the value for this is to inform the model that we're trying to build. If that's the case, it seems that the single histogram that is dremelable is the way to go. Thoughts?
,
Aug 10 2016
ok, I'll collapse it to a single histogram.
,
Aug 13 2016
"DPM is too high in some regions" - sorry, what is DPM?
"on many regions the dest for currently offered translations is English because that's the UI language by default." - how is dest useful in the histograms, then? (Also, translate is working on improving that.)
"Also, users can pick the src (if we get it wrong) and dest (if they don't want the ui language), which we otherwise can't capture." - Does this happen often enough to be of concern?
We *do* offer translations away from the UI language. There's a Finch trial in Indonesia/Malaysia going on, and we're currently hooking up ULP to determine a user's language.
"This dataset is also intended to be a learning source for experiments with smarter dest language proposals." - Is this coordinated with e.g. the ULP efforts, and several other efforts around dest language?
,
Aug 19 2016
Decisions per Mille: the number of times the user is interrupted to make a decision (like "would you like to translate this page?") per thousand navigations.
"on many regions the dest for currently offered translations is English ..." - Dest is useful precisely because we want to experiment with offering other languages.
AFAIK, we have no idea, because we don't have metrics.
ULP, to the best of my knowledge, only supports signed-in users. There are other efforts to consider local browsing history, region, device settings, etc. for non-signed-in users to determine the best language to (a) offer as a translation target and (b) offer suggested content (Zine, for example).
"This dataset is also intended to be a learning source for experiments with ..." - This is in coordination with the Chrome Machine Intelligence and DPM efforts around language.
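As a quick illustration of the DPM definition in this comment (decision prompts per thousand navigations), the arithmetic can be sketched as follows. The function and parameter names are hypothetical and only serve the example.

```python
def decisions_per_mille(decision_prompts, navigations):
    """Return how many decision prompts were shown per 1000 navigations.

    Both arguments are plain counts over the same set of sessions; the
    names are illustrative, not taken from any Chromium metric.
    """
    if navigations <= 0:
        raise ValueError("navigations must be positive")
    return 1000.0 * decision_prompts / navigations
```

For example, 25 translate prompts over 10,000 navigations gives a DPM of 2.5.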
,
Aug 19 2016
Hi Rachel, Zack. Speaking just on the metrics side, we think the solution that Roger has described makes sense: one histogram per outcome, with buckets of [src-dest]. This keeps the absolute number of histograms low (< 10), and he can use a sparse histogram for this. This will still not work that well with the internal tools, since the number of buckets will be very large, but in this case there will need to be custom analysis anyway.
The reason to do this over putting all 3 dimensions in one giant histogram is that the giant histogram would be much less useful, while this version should still be usable reasonably well with the internal tools. The cost to both clients and our pipelines will be roughly equivalent in the two, so I prefer the split version, which will be much more understandable.
For why destination is needed, I don't think that looking at geolocation data is a good enough proxy for what the MI team is trying to achieve.
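The split layout described in this comment (one sparse histogram per outcome, bucketed by src/dest pair) could be modelled roughly as below. This is a sketch only: the outcome names come from the issue description, while the class, the 200-language limit, and the bucket formula are assumptions for illustration and do not reflect Chromium's histogram API.

```python
from collections import Counter

NUM_LANGUAGES = 200  # hypothetical upper bound on language indices

# Outcome names follow the list in the issue description.
OUTCOMES = (
    "Ignored", "AcceptedOnce", "AcceptedForSite", "AcceptedForLanguage",
    "DeclinedOnce", "DeclinedForSite", "DeclinedForLanguage",
)


class TranslateOutcomeHistograms:
    """One sparse histogram per outcome; each bucket is a (src, dst) pair."""

    def __init__(self):
        # A Counter only stores buckets that were actually hit, i.e. it is sparse.
        self._histograms = {outcome: Counter() for outcome in OUTCOMES}

    @staticmethod
    def _bucket(src_index, dst_index):
        if not (0 <= src_index < NUM_LANGUAGES and 0 <= dst_index < NUM_LANGUAGES):
            raise ValueError("language index out of range")
        return src_index * NUM_LANGUAGES + dst_index

    def record(self, outcome, src_index, dst_index):
        self._histograms[outcome][self._bucket(src_index, dst_index)] += 1

    def count(self, outcome, src_index, dst_index):
        return self._histograms[outcome][self._bucket(src_index, dst_index)]

    def non_empty_buckets(self, outcome):
        return len(self._histograms[outcome])
```

Compared with a single three-dimensional key, each histogram here has a name that states the outcome directly, and only the language pair needs decoding during analysis.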
,
Aug 22 2016
I defer to you all on what is best! I don't have a super strong opinion here.
,
Sep 17 2016
,
Oct 7 2016
In working with the translate, metrics and Chrome MI teams, we've reached a consensus to log a new custom proto to UMA that captures all of the details described above, plus some more needed by Chrome MI and desired by translate. See: https://codereview.chromium.org/2394643002/
,
Oct 7 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/01e12a12485f430701ec9c47974e71b3ecc99563
commit 01e12a12485f430701ec9c47974e71b3ecc99563
Author: hamelphi <hamelphi@chromium.org>
Date: Fri Oct 07 21:27:02 2016
Add TranslateEventProto.
BUG=653700, 634961
Review-Url: https://codereview.chromium.org/2394643002
Cr-Commit-Position: refs/heads/master@{#423969}
[modify] https://crrev.com/01e12a12485f430701ec9c47974e71b3ecc99563/components/metrics/proto/BUILD.gn
[modify] https://crrev.com/01e12a12485f430701ec9c47974e71b3ecc99563/components/metrics/proto/chrome_user_metrics_extension.proto
[add] https://crrev.com/01e12a12485f430701ec9c47974e71b3ecc99563/components/metrics/proto/translate_event.proto
,
Oct 12 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/b257fe9351c22aaae756a8527442193ef412d6cf
commit b257fe9351c22aaae756a8527442193ef412d6cf
Author: hamelphi <hamelphi@chromium.org>
Date: Wed Oct 12 22:07:53 2016
Send TranslateEventProtos to UMA
Cache TranslateEvents and send them to UMA through the metrics provider.
BUG=653700, 634961
Review-Url: https://codereview.chromium.org/2395253002
Cr-Commit-Position: refs/heads/master@{#424878}
[modify] https://crrev.com/b257fe9351c22aaae756a8527442193ef412d6cf/components/translate/core/browser/BUILD.gn
[modify] https://crrev.com/b257fe9351c22aaae756a8527442193ef412d6cf/components/translate/core/browser/translate_ranker.cc
[modify] https://crrev.com/b257fe9351c22aaae756a8527442193ef412d6cf/components/translate/core/browser/translate_ranker.h
[modify] https://crrev.com/b257fe9351c22aaae756a8527442193ef412d6cf/components/translate/core/browser/translate_ranker_metrics_provider.cc
[modify] https://crrev.com/b257fe9351c22aaae756a8527442193ef412d6cf/components/translate/core/browser/translate_ranker_unittest.cc
,
Nov 16 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce
commit 2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce
Author: rogerm <rogerm@chromium.org>
Date: Wed Nov 16 07:17:05 2016
Integrate TranslateEventProto UMA logging into TranslateManager.
This CL integrates the TranslateEventProto into the TranslateManager in order to report these events via UMA.
BUG=634961, 653700
Review-Url: https://codereview.chromium.org/2400503002
Cr-Commit-Position: refs/heads/master@{#432416}
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/translate/chrome_translate_client.cc
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/translate/chrome_translate_client.h
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/translate/translate_manager_render_view_host_unittest.cc
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/ui/browser_commands.cc
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/ui/browser_window.h
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/ui/cocoa/browser_window_cocoa.h
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/ui/cocoa/browser_window_cocoa.mm
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/ui/translate/translate_bubble_factory.cc
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/ui/translate/translate_bubble_factory.h
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/ui/translate/translate_bubble_view_state_transition.h
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/ui/views/frame/browser_view.cc
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/browser/ui/views/frame/browser_view.h
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/test/base/test_browser_window.cc
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/chrome/test/base/test_browser_window.h
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/components/metrics/proto/translate_event.proto
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/components/translate/core/browser/translate_manager.cc
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/components/translate/core/browser/translate_manager.h
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/components/translate/core/browser/translate_ranker.cc
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/components/translate/core/browser/translate_ranker.h
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/components/translate/core/browser/translate_ui_delegate.cc
[modify] https://crrev.com/2847cf85bbbd9dcfc7f9bcc3b6add85d709155ce/tools/metrics/histograms/histograms.xml
,
Apr 14 2017
@rogerm, @hamelphi - should this be marked as fixed?
,
Apr 14 2017
,
Apr 27 2017
Comment 1 by rogerm@chromium.org
, Aug 5 2016