New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 603884 link

Starred by 1 user

Issue metadata

Status: Verified
Owner:
Last visit > 30 days ago
Closed: May 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 1
Type: Bug

Blocking:
issue 584266



Sign in to add a comment

Duplicate snippets

Project Member Reported by nepper@chromium.org, Apr 15 2016

Issue description

Version: 51.0.2704.5
OS: Android

I love Game of Thrones, but getting the same card twice doesn't seem right :).

What is the expected output?

Have any snippet just show up once

What do you see instead?

A snippet showed up twice


Please use labels and text to provide additional information.

 
Screenshot_20160415-101950.png
325 KB View Download

Comment 1 by nepper@chromium.org, Apr 15 2016

Owner: dgn@chromium.org
Status: Assigned (was: Available)

Comment 2 by dgn@chromium.org, Apr 19 2016

I have a hard time reproducing it. I think the issue is coming from what ChromeReader sends us. Can you please use chrome://snippets-internal to send me examples of duplicates?

Comment 3 by dgn@chromium.org, Apr 19 2016

Also, did you have --ntp-snippets-dont-restrict set? (that's also visible from chrome://snippets-internal)

Comment 4 by nepper@chromium.org, Apr 19 2016

I'll report a case as soon as I see it.

No, didn't have that flag enabled. What is it for?

Comment 5 by nepper@chromium.org, Apr 19 2016

I have this case again, but chrome://snippets-internal does not exist in the latest Canary.

Duplicates:
http://edition.cnn.com/2016/04/18/asia/philippines-rodrigo-duterte-gang-rape-comment-apology/
http://m.wdsu.com/national/philippine-candidate-sorry-over-gang-rape-joke/39096116

Comment 6 by nepper@chromium.org, Apr 19 2016

Cc: dgn@chromium.org
Owner: maybelle@chromium.org
May, I think this should get resolved when your change regarding content info lands:

When there are multiple URLs with the same content, then contentInfo.url points to just one of them. But the source_corpus_info will have the correct url based on the restrict.

Do you have a bug on file? If yes, we may want to dupe this one against it.
We don't have a bug for it specifically, I had as part of my favicon change. 

Note that if there are multiple sources within your host restrict that have the article (e.g. you have both nytimes.com and cnn.com and they both carry the same story), duplicates could still happen unless the source_corpus_info ordering is deterministic. Right now we use the first source_corpus_info we find as the source of the article, but if that order changes when we get another set of data from ChromeReader, we could have dupes.

It's worth keeping around this bug to fix for that case (e.g., look at all the available sources and de-dupe based on that).

Comment 8 by fi...@chromium.org, Apr 22 2016

Blocking: 584266
Labels: zine-mr-iter-13
Starting on this after 596410 lands, as it depends on it.
Also make sure DiscardSnippet() uses the same de-duping logic.
 Issue 596410  is fixed. I guess this is next now? :)
Owner: ----
Status: Available (was: Assigned)
Well,  Issue 609112  has higher pri on my list. In fact, I'll unassign for now to allow others to take this if they want as it's a pretty independent change. I'll pick it up later if no one else has.
Cc: maybelle@chromium.org
When fixing this bug, make sure to make a pass over SnippetsService and fix all the places that rely only on the main URL as the identifier of a snippet.

Comment 15 by treib@chromium.org, May 10 2016

I have found another repro of this - two "identical" snippets where the main URL differs only by a query param. Screenshot attached.
2016-05-10.png
337 KB View Download
Owner: tschumann@chromium.org
Status: Assigned (was: Available)
Project Member

Comment 18 by bugdroid1@chromium.org, May 18 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/ab78ef3824631e1ef5fffba7c21946d66b9ad84f

commit ab78ef3824631e1ef5fffba7c21946d66b9ad84f
Author: tschumann <tschumann@chromium.org>
Date: Wed May 18 20:42:27 2016

Use multiple IDs when discarding or merging snippets.

Changes the discarding and de-duplication logic to take IDs of all snippet
sources (these are always URLs in our cases) into account.

R=treib@chromium.org
BUG= 603884 

Review-Url: https://codereview.chromium.org/1992803002
Cr-Commit-Position: refs/heads/master@{#394539}

[modify] https://crrev.com/ab78ef3824631e1ef5fffba7c21946d66b9ad84f/components/ntp_snippets/ntp_snippets_service.cc
[modify] https://crrev.com/ab78ef3824631e1ef5fffba7c21946d66b9ad84f/components/ntp_snippets/ntp_snippets_service_unittest.cc

Status: Fixed (was: Assigned)
Status: Verified (was: Fixed)
verified in M52-52.0.2743.8
Labels: zine-mr-MVP
Labels: -zine-mr-mvp
Labels: zine-mr-MVP

Sign in to add a comment