New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 829122 link

Starred by 4 users

Issue metadata

Status: Fixed
Owner:
Closed: Apr 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 1
Type: Bug

Blocking:
issue 807013



Sign in to add a comment

ChromeStorageImpl Shutdown Flaky

Project Member Reported by jonr...@chromium.org, Apr 4 2018

Issue description

OS: Win 7
Test Suites: browser_tests, viz_browser_tests 
            (browser_tests -enable-features=VizDisplayCompositor)

Failing Tests:
  AllowLocalhostErrorsSSLHostStateDelegateTest.LocalhostErrorWithFlag
  ConstrainedWebDialogBrowserTest.ReleaseWebContents
  ExtensionInstallDialogViewInteractiveBrowserTest.InvokeUi_External
  ExtensionInstallDialogViewInteractiveBrowserTest.InvokeUi_Simple
  ManifestVerifierBrowserTest.BobPayCanUseAnyMethodOnOwnOrigin
  PolicyTest.FullscreenAllowedBrowser
  SSLUITest.CertificateTransparencyEnforcementDisabledForUrls/0
  V4SafeBrowsingServiceTest.CheckBrowseUrl

We have seen a flaky shutdown error related to the destruction of ChromeStorageImpl happening on the wrong thread.

First reported flake: https://ci.chromium.org/buildbot/tryserver.chromium.win/win7_chromium_rel_ng/137070

The change that saw the flake only updated a policy json file.

Stack Trace:
 json_pref_store.cc(176)] Check failed: (sequence_checker_).CalledOnValidSequence().
Backtrace:
	base::debug::StackTrace::StackTrace [0x03677780+32]
	base::debug::StackTrace::StackTrace [0x0363D2ED+13]
	logging::LogMessage::~LogMessage [0x035E33F3+83]
	JsonPrefStore::RemoveObserver [0x04768F8F+95]
	ScopedObserver<PrefStore,autofill::ChromeStorageImpl>::RemoveAll [0x06BA2E42+40]
	autofill::ChromeStorageImpl::~ChromeStorageImpl [0x06BA2AFF+27]
	autofill::ChromeStorageImpl::`scalar deleting destructor' [0x06BA2D5B+11]
	base::internal::BindState<std::unique_ptr<autofill::AddressValidator,std::default_delete<autofill::AddressValidator> > (__cdecl*)(std::unique_ptr<i18n::addressinput::Source,std::default_delete<i18n::addressinput::Source> >,std::unique_ptr<i18n::addressinp [0x06BA12C5+37]
	base::internal::CallbackBase::~CallbackBase [0x035F6C19+25]
	base::internal::BindState<void (__thiscall WebRtcRtpDumpHandler::*)(base::RepeatingCallback<void __cdecl(void)> const &,enum RtpDumpType,bool,bool),base::internal::UnretainedWrapper<WebRtcRtpDumpHandler>,base::RepeatingCallback<void __cdecl(void)>,enum Rt [0x04AA13D3+19]
	base::internal::CallbackBase::Reset [0x035F69CF+31]
	base::internal::PostTaskAndReplyImpl::PostTaskAndReply [0x03688852+1298]
	base::internal::PostTaskAndReplyImpl::PostTaskAndReply [0x03688AF3+1971]
	base::internal::CallbackBaseCopyable::operator= [0x035F6999+41]
	base::internal::TaskTracker::RunOrSkipTask [0x036AA2DC+908]
	base::internal::TaskTracker::RunAndPopNextTask [0x036A9819+441]
	base::internal::SchedulerWorker::Thread::ThreadMain [0x036B56AF+1023]
	base::PlatformThread::GetCurrentThreadPriority [0x0360E2B5+469]
	BaseThreadInitThunk [0x7612336A+18]
	RtlInitializeExceptionChain [0x76FD9882+99]
	RtlInitializeExceptionChain [0x76FD9855+54]

Hey rouslan@,

I see that you are an owner for ChromeStorageImpl. Could you help triage this?
 

Comment 1 by kbr@chromium.org, Apr 5 2018

Cc: offenwanger@chromium.org st...@chromium.org
Components: Internals>Preferences Tests>Flaky UI>Browser>Autofill
Labels: -Pri-3 OS-Windows Pri-1
Upgrading to P1. This is affecting potentially a lot of jobs on win7_chromium_rel_ng . I've seen a random flake on for example this other unrelated tryjob:

https://ci.chromium.org/buildbot/tryserver.chromium.win/win7_chromium_rel_ng/137632

Swarming shard:

https://chromium-swarm.appspot.com/task?id=3cab8d2dada38610&refresh=10&show_raw=1

stgao@, this would be an ideal situation for Findit to find similar crashes across multiple test suites.

Comment 2 by kbr@chromium.org, Apr 5 2018

Cc: kbr@chromium.org
Owner: se...@chromium.org
Seb: Did anything change in libaddressinput recently?

Comment 4 by st...@chromium.org, Apr 5 2018

Re #1: Many thanks for the example and suggestion! I filed bug 829398. However, this is a hard problem, and we are already short of hands for current priority tasks, thus we might not be able to look into that any time sooner.

Comment 5 by gab@chromium.org, Apr 10 2018

Description: Show this description

Comment 6 by gab@chromium.org, Apr 10 2018

Blocking: 807013
Cc: se...@chromium.org
Labels: Test-Flaky
Owner: fdoray@chromium.org
Status: Assigned (was: Untriaged)
@fdoray : the flakes appear very much related to r548034 (since the first flake was on a job @ r548038 and the stack is closely related to PostTaskAndReply)

This is making the CQ very flaky, I'll revert your CL until we figure it out.
Project Member

Comment 7 by bugdroid1@chromium.org, Apr 10 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/0e2e01d06ec304d9a9a014d8a6054ce448c4583c

commit 0e2e01d06ec304d9a9a014d8a6054ce448c4583c
Author: Gabriel Charette <gab@chromium.org>
Date: Tue Apr 10 02:56:58 2018

Revert "Reduce leaks in PostTaskAndReplyImpl::PostTaskAndReply (reland)."

This reverts commit 8e10a0af8da100dcb25c92afe395805ca01e577d.

Reason for revert: surfaces flakes in browser_tests, see  crbug.com/829122 
(only reverting base/ changes, test fixes can stay in)

Original change's description:
> Reduce leaks in PostTaskAndReplyImpl::PostTaskAndReply (reland).
>
> This CL first landed as https://chromium-review.googlesource.com/902191.
> It was reverted because of a leak in PipeReaderTest.Cancel, which is
> fixed by this CL by flushing the ScopedTaskEnvironment before the test
> ends.
>
> Before, there was always a leak when the RunTaskAndPostReply callback
> posted by PostTaskAndReplyImpl::PostTaskAndReply didn't run.
>
> With this CL, the "task" is never leaked and the "reply" is only
> leaked if the execution environment is shutdown before the deletion
> happens (e.g. MessageLoop deleted, TaskScheduler shutdown).
>
> TBR=gab@chromium.org,sky@chromium.org,kbr@chromium.org
>
> Bug:  807013 
> Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
> Change-Id: Ic28839da7aed25fc56a4abadc565b7f4a6e0b5c5
> Reviewed-on: https://chromium-review.googlesource.com/997892
> Reviewed-by: François Doray <fdoray@chromium.org>
> Reviewed-by: Gabriel Charette <gab@chromium.org>
> Reviewed-by: Steven Bennetts <stevenjb@chromium.org>
> Commit-Queue: François Doray <fdoray@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#548919}

TBR=stevenjb@chromium.org,sky@chromium.org,gab@chromium.org,fdoray@chromium.org,kbr@chromium.org

# Not skipping CQ checks because original CL landed > 1 day ago.

Bug:  807013 ,  829122 
Change-Id: Iff4ca23f3cf011b0587478f479a1676e618b741b
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Reviewed-on: https://chromium-review.googlesource.com/1003792
Commit-Queue: Gabriel Charette <gab@chromium.org>
Reviewed-by: Gabriel Charette <gab@chromium.org>
Cr-Commit-Position: refs/heads/master@{#549398}
[modify] https://crrev.com/0e2e01d06ec304d9a9a014d8a6054ce448c4583c/base/threading/post_task_and_reply_impl.cc
[modify] https://crrev.com/0e2e01d06ec304d9a9a014d8a6054ce448c4583c/base/threading/post_task_and_reply_impl.h
[modify] https://crrev.com/0e2e01d06ec304d9a9a014d8a6054ce448c4583c/base/threading/post_task_and_reply_impl_unittest.cc

Project Member

Comment 8 by bugdroid1@chromium.org, Apr 10 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/27ebcee438e89fd235a3c7e674294aa199b1cfa5

commit 27ebcee438e89fd235a3c7e674294aa199b1cfa5
Author: Francois Doray <fdoray@chromium.org>
Date: Tue Apr 10 23:54:23 2018

Force deletetion of ChromeStorageImpl on the main thread.

With https://chromium-review.googlesource.com/997892, a task posted
via base::PostTaskWithTraitsAndReplyWithResult() can be deleted on the
sequence on which it was supposed to run if it can't run because of its
shutdown behavior.

The constructor of AddressNormalizerImpl uses
base::PostTaskWithTraitsAndReplyWithResult() to post a task to which
a ChromeStorageImpl is bound. If the task runs before shutdown, the
ChromeStorageImpl is returned to the main thread via the reply.
However, if the task doesn't run before shutdown starts, it is
deleted in the TaskScheduler thread pool (which also means that the
ChromeStorageImpl is deleted in the TaskScheduler thread pool).
An access race occurs when ~ChromeStorageImpl tries to remove itself
from the observer list of its WriteablePrefStore.

To fix this race, this CL uses base::OnTaskRunnerDeleter() to
ensure that the deletion of ChromeStorageImpl occurs on the right
sequence.

Bug:  829122 
Change-Id: I75ba8eab4a884b0f1ac5e571eaaab18f63599d6b
Reviewed-on: https://chromium-review.googlesource.com/1005915
Reviewed-by: Evan Stade <estade@chromium.org>
Commit-Queue: François Doray <fdoray@chromium.org>
Cr-Commit-Position: refs/heads/master@{#549682}
[modify] https://crrev.com/27ebcee438e89fd235a3c7e674294aa199b1cfa5/components/autofill/core/browser/address_normalizer_impl.cc

Comment 9 by se...@chromium.org, Apr 11 2018

Sorry I just saw this thread. Thanks Francois for fixing. I'll take a look at your CL to learn from that mistake.
Status: Fixed (was: Assigned)
sebsg@: Your code wasn't wrong when you wrote it. We are changing the PostTaskAndReply implementation to prevent memory leaks, and that introduces behavior changes to which existing code needs to be adapted :)

Sign in to add a comment