Titles and thumbnails need to be from the best source
Issue description
Currently, the thumbnail and title we display for each card are a union from all the sources for the article. ChromeReader has pushed a change that lists this metadata per source, so we should update the code on our side to use it.
,
May 17 2016
,
May 17 2016
My understanding is that we want to select the best source in NTPSnippet::CreateFromDictionary and display the title and image associated with that source.
,
May 18 2016
From Raghu: "Instead of adding the title / images etc. to the SourceCorpusInfo field, I made internal changes to remove the bad (ie from some other source) title, images, snippet etc. So, it should work with your current code. In fact, this went live last week. So if you have observed bad metadata, that is a bug and it would be great to have a screenshot for debugging it." So he fixed this server side :-)
,
May 18 2016
Issue 610335 has been merged into this issue.
,
May 20 2016
Reopening this, as it's not actually fixed - if we have multiple sources, we will choose one of them as the "best", and we might still get title/thumbnail from another one. Setting to P3 though, until we actually see a problem caused by this.
,
May 20 2016
Mark, why do you think this is not a problem? Even if I have let's say Techcrunch and CNN in my list of most visited sites, I still don't want to see a snippet that mixes publisher attribution of one publisher with title/thumbnail from another. Am I missing something?
,
May 20 2016
I reopened the bug because I think it *is* a problem :) Just not a very common one: As you say, you need to have two or more sites in your list which have published an identical article. In addition, the problem will only be visible if the title or thumbnail actually contain publisher-specific info. Of course, feel free to re-prioritize!
,
May 20 2016
Yeah, I think even if it doesn't hit 90% of our users, we need to make sure this is correct.
,
May 20 2016
I'll get some more detail from Raghu about what he implemented.
,
May 24 2016
I'm back to push notifications for a few weeks, so marking this available again.
,
Jun 21 2016
,
Jun 27 2016
Michael, do you have any insights around this bug from Raghu?
,
Jun 28 2016
After an offline chat with Michael and Tim, here is my summary of the situation:
1) I believe this bug is fixed. When ChromeReader returns multiple sources, these are now always from the same publisher, so no mismatch should appear.
2) ChromeReader already makes an effort to pick the best source for the title and thumbnail. I believe we should not duplicate this work. We should instead pick the best source according to ChromeReader's choice (we can do this based on the URLs of the suggestion and the URLs of the sources).
3) Limiting sources to the same publisher is an unfortunate choice for de-duplication across multiple fetches on the client (or later on our suggestions server).
I propose the following steps:
- Select the source whose URL matches the article's URL as the best source.
- Ask ChromeReader to list all sources as they did before. (This will not happen before the M53 BP; we need to be careful and ask them to revert if it breaks further things.)
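For illustration, the first proposed step (picking the source whose URL equals the article's URL) could be sketched roughly as below. This is only a sketch under assumptions: the `SnippetSource` struct, its field names, and the `FindSourceMatchingArticleUrl` helper are hypothetical and do not mirror the actual NTPSnippet code or the ChromeReader response format.

```cpp
#include <string>
#include <vector>

// Hypothetical per-source metadata. The field names are illustrative only
// and are not taken from the real NTPSnippet or ChromeReader types.
struct SnippetSource {
  std::string url;
  std::string publisher_name;
  std::string title;
  std::string thumbnail_url;
};

// Returns the source whose URL equals the article URL, or nullptr when no
// source matches. The caller has to decide on a fallback in that case.
const SnippetSource* FindSourceMatchingArticleUrl(
    const std::string& article_url,
    const std::vector<SnippetSource>& sources) {
  for (const SnippetSource& source : sources) {
    if (source.url == article_url)
      return &source;
  }
  return nullptr;
}
```

Taking the title, thumbnail, and publisher attribution from the single returned source would keep them consistent with each other. The nullptr case is kept explicit so that an article URL missing from the source list is not silently papered over.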
,
Jun 28 2016
By experimenting I found out that ChromeReader sometimes provides a suggestion with a URL that is not in the corpus. I need to figure it out with Raghu.
,
Jun 28 2016
,
Jun 29 2016
After some clarification with Raghu: 2) above is wrong. 1) still holds. No action needed now, we have no evidence that this bug is not fixed. This issue also touches de-duplication and source quality (as ChromeReader limits the list of sources, we cannot de-duplicate so well and we may be forced to pick a sub-optimal source). We'll continue discussing with Raghu how to improve on that front.
,
Jul 1 2016
,
Jul 1 2016
,
Jul 1 2016
,
Jul 1 2016
Comment 1 by nepper@chromium.org, May 13 2016
Status: Available (was: Untriaged)