Project: chromium Issues People Development process History Sign in
New issue
Advanced search Search tips
Issue 702300 "ContentHashFetcherTest.MissingVerifiedContents" is flaky
Starred by 1 user Project Member Reported by chromium...@appspot.gserviceaccount.com, Mar 16 Back to list
Status: Fixed
Owner:
Closed: Apr 19
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug



Sign in to add a comment
"ContentHashFetcherTest.MissingVerifiedContents" is flaky.

This issue was created automatically by the chromium-try-flakes app. Please find the right owner to fix the respective test/step and assign this issue to them. If the step/test is infrastructure-related, please add Infra-Troopers label and change issue status to Untriaged. When done, please remove the issue from Sheriff Bug Queue by removing the Sheriff-Chromium label.

We have detected 5 recent flakes. List of all flakes can be found at https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyOQsSBUZsYWtlIi5Db250ZW50SGFzaEZldGNoZXJUZXN0Lk1pc3NpbmdWZXJpZmllZENvbnRlbnRzDA.

Flaky tests should be disabled within 30 minutes unless culprit CL is found and reverted. Please see more details here: https://sites.google.com/a/chromium.org/dev/developers/tree-sheriffs/sheriffing-bug-queues#triaging-auto-filed-flakiness-bugs
 
Labels: -Sheriff-Chromium
Owner: fdoray@chromium.org
I suspect https://codereview.chromium.org/2672623006 here

ContentHashFetcherTest.MissingVerifiedContents (run #1):
[ RUN      ] ContentHashFetcherTest.MissingVerifiedContents
[2696:2703:0317/052742.974648:858251351:FATAL:url_fetcher_core.cc(127)] Check failed: delegate_task_runner_->RunsTasksOnCurrentThread().
#0 0x000001cbf137 base::debug::StackTrace::StackTrace()
#1 0x000001cd0bea logging::LogMessage::~LogMessage()
#2 0x00000256d721 net::URLFetcherCore::Stop()
#3 0x0000024a9c21 net::URLFetcherImpl::~URLFetcherImpl()
#4 0x0000024a9c69 net::URLFetcherImpl::~URLFetcherImpl()
#5 0x000001b334d4 extensions::ContentHashFetcherJob::~ContentHashFetcherJob()
#6 0x000001b33569 extensions::ContentHashFetcherJob::~ContentHashFetcherJob()
#7 0x00000097bb74 _ZN4base8internal9BindStateIMN7content19AppCacheStorageImpl12DatabaseTaskEFvvEJ13scoped_refptrIS4_EEE7DestroyEPKNS0_13BindStateBaseE
#8 0x000001cb854f base::internal::CallbackBase<>::operator=()
#9 0x000001d14cf8 base::SequencedWorkerPool::Inner::ThreadLoop()
#10 0x000001d143dc base::SequencedWorkerPool::Worker::Run()
#11 0x000001d187b5 base::SimpleThread::ThreadMain()
#12 0x000001d1329c base::(anonymous namespace)::ThreadFunc()
#13 0x7feaf795d184 start_thread
#14 0x7feaf339537d clone

Labels: Sheriff-Chromium
Detected 3 new flakes for test/step "ContentHashFetcherTest.MissingVerifiedContents". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyOQsSBUZsYWtlIi5Db250ZW50SGFzaEZldGNoZXJUZXN0Lk1pc3NpbmdWZXJpZmllZENvbnRlbnRzDA. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).
Detected 5 new flakes for test/step "ContentHashFetcherTest.MissingVerifiedContents". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyOQsSBUZsYWtlIi5Db250ZW50SGFzaEZldGNoZXJUZXN0Lk1pc3NpbmdWZXJpZmllZENvbnRlbnRzDA. This message was posted automatically by the chromium-try-flakes app.
Detected 3 new flakes for test/step "ContentHashFetcherTest.MissingVerifiedContents". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyOQsSBUZsYWtlIi5Db250ZW50SGFzaEZldGNoZXJUZXN0Lk1pc3NpbmdWZXJpZmllZENvbnRlbnRzDA. This message was posted automatically by the chromium-try-flakes app.
Detected 4 new flakes for test/step "ContentHashFetcherTest.MissingVerifiedContents". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyOQsSBUZsYWtlIi5Db250ZW50SGFzaEZldGNoZXJUZXN0Lk1pc3NpbmdWZXJpZmllZENvbnRlbnRzDA. This message was posted automatically by the chromium-try-flakes app.
Labels: -Sheriff-Chromium
Status: Assigned
Labels: Sheriff-Chromium
Detected 5 new flakes for test/step "ContentHashFetcherTest.MissingVerifiedContents". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyOQsSBUZsYWtlIi5Db250ZW50SGFzaEZldGNoZXJUZXN0Lk1pc3NpbmdWZXJpZmllZENvbnRlbnRzDA. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).
Labels: -Sheriff-Chromium
This is still popping up in the Sheriff queue. Any progress that we can make here?
I know how to fix this, but don't have enough time to implement the fix right now. I created a revert of https://codereview.chromium.org/2672623006 in the meantime (https://codereview.chromium.org/2787453003/).


Labels: Sheriff-Chromium
Detected 3 new flakes for test/step "ContentHashFetcherTest.MissingVerifiedContents". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyOQsSBUZsYWtlIi5Db250ZW50SGFzaEZldGNoZXJUZXN0Lk1pc3NpbmdWZXJpZmllZENvbnRlbnRzDA. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).
Labels: -Sheriff-Chromium
Labels: Sheriff-Chromium
Detected 3 new flakes for test/step "ContentHashFetcherTest.MissingVerifiedContents". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyOQsSBUZsYWtlIi5Db250ZW50SGFzaEZldGNoZXJUZXN0Lk1pc3NpbmdWZXJpZmllZENvbnRlbnRzDA. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).
Cc: fdoray@chromium.org
Components: Platform>Extensions
Owner: lazyboy@chromium.org
Reverting my CL didn't fix the problem.

The issue is that the net::URLFetcher owned by ContentHashFetcherJob is not deleted on the sequence from which it is created.

IIUC, the net::URLFetcher is created on a browser thread in ContentHashFetcherJob::DoneCheckingForVerifiedContents().

It is then destroyed when the last reference to the ContentHashFetcherJob is dropped. Unfortunately, multiple sequences hold a reference to ContentHashFetcherJob...

E.g.

Reference owned by the reply running on a browser thread: https://cs.chromium.org/chromium/src/extensions/browser/content_hash_fetcher.cc?l=206

Reference owned by a task running in the blocking pool: https://cs.chromium.org/chromium/src/extensions/browser/content_hash_fetcher.cc?l=344 

Assigning to extensions/ owner.
Cc: rdevlin....@chromium.org
 Issue 702891  has been merged into this issue.
ContentHashFetcherJob ref count thing is a bit wonky, I guess it's because we expect to call Cancel() on it from any thread?
Labels: -Sheriff-Chromium
Removing Sheriff-Chromium as it is assigned.

This popped up again on sheriff-o-matic today - I'll disable ContentHashFetcherTest.MissingVerifiedContents on Linux.
Project Member Comment 19 by bugdroid1@chromium.org, Apr 7
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/d821806d4441025512588c514e66756323eb1408

commit d821806d4441025512588c514e66756323eb1408
Author: lukasza <lukasza@chromium.org>
Date: Fri Apr 07 18:35:32 2017

Disable flaky ContentHashFetcherTest.MissingVerifiedContents on Linux.

BUG= 702300 
TBR=lazyboy@chromium.org

Review-Url: https://codereview.chromium.org/2805443005
Cr-Commit-Position: refs/heads/master@{#462936}

[modify] https://crrev.com/d821806d4441025512588c514e66756323eb1408/extensions/browser/content_hash_fetcher_unittest.cc

I've identified and finally managed to repro the failure.

The last ref to ContentHashFetcherJob (CHFJ) is released from ContentHashFetcherWaiter::WaitForCallback(), however if the destruction of CHFJ comes after the test class's TestBrowserThreadBundle destruction, there won't be any thread and the test will fail.
Narrowed down the root cause:
PostTask T [1] from blocking pool [2] sometimes causes T to run synchronously.

I have a isolated repro here: https://codereview.chromium.org/2824033002/
To repro you'd run the test repeatedly [3] while your machine is under stress [4].

fdoray@, do you have any insight to this?


[1]
https://cs.chromium.org/chromium/src/extensions/browser/content_hash_fetcher.cc?rcl=fb11d048bac3ddc72bb78399e5b7e8057fd050c7&l=342

[2]
https://cs.chromium.org/chromium/src/extensions/browser/content_hash_fetcher.cc?rcl=fb11d048bac3ddc72bb78399e5b7e8057fd050c7&l=362

[3]
$ ./out/Release/extensions_unittests --enable-pixel-output-in-tests --gtest_filter=ContentHashFetcherTest.TestCase --gtest_repeat=200 

[4]
For me I'd either run another instance of chrome build w/o goma:
$ ninja -C out/Debug chrome -j 1024
or using stress tool (random numbers below)
$ stress --cpu 2048 -i 512 --timeout 60
What is the expected failure when trying the repro https://codereview.chromium.org/2824033002/?
@22, if you run the tests repeatedly, the particular test that hits the case will fail. In particular:
ASSERT_TRUE(job->post_task_finished());
will fail because the callback was called during BrowserThread::PostTask.

A successful test log will look like:
Task1OnBlockingPool BEG
Task1OnBlockingPool END
Task2OnCreationThread
JobWaiter::Callback

A failed test log looks like:
Task1OnBlockingPool BEG
Task2OnCreationThread <- called during PostTask
JobWaiter::Callback
Task1OnBlockingPool END
The blocking pool has its own threads which can run tasks in parallel with the main thread. As soon as Task1OnBlockingPool() posts Task2OnCreationThread() to |creation_thread_|, Task2OnCreationThread() can run. The API doesn't say that Task1OnBlockingPool() has to finish running before Task2OnCreationThread() runs.
Project Member Comment 25 by bugdroid1@chromium.org, Apr 19
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/825099a68908767ecd35d79b0888887c4ffccd0b

commit 825099a68908767ecd35d79b0888887c4ffccd0b
Author: lazyboy <lazyboy@chromium.org>
Date: Wed Apr 19 02:12:04 2017

Fix two ContentHashFetcherTest.* tests' flakiness.

ContentHashFetcherJob::url_fetcher_ is always created on
ContentHashFetcherJob::creation_thread_ but it was possible for
it to be destroyed on blocking pool with the destruction of
ContentHashFetcherJob. This causes test flakiness, and this CL
makes sure we always destroy url_fetcher_ on correct thread.

As an optimization for the most common case, also destroy
ContentHashFetcherJob::url_fetcher_ once we're done with it (earlier
than above). However, note that because jobs can be cancelled, we
still need to make sure to destroy url_fetcher_ on correct thread from
ContentHashFetcherJob destructor.

BUG= 702300 
Test=No visible change. Fixing test flakiness and potential ref issue in
content verification code.
To repro the flake, I have put my machine on heavy load (compiling
chromium without goma and -j 1024), then ran these tests repeatedly
with --gtest_repeat flag. Tests will be flaky without this change, but
they always pass with this CL.

Review-Url: https://codereview.chromium.org/2815243002
Cr-Commit-Position: refs/heads/master@{#465468}

[modify] https://crrev.com/825099a68908767ecd35d79b0888887c4ffccd0b/extensions/browser/content_hash_fetcher.cc
[modify] https://crrev.com/825099a68908767ecd35d79b0888887c4ffccd0b/extensions/browser/content_hash_fetcher_unittest.cc

Status: Fixed
Sign in to add a comment