
Issue 892550

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Oct 30
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 1
Type: Bug
Flaky-Test: SingleClientPollingSyncTest.ShouldUpdatePollPrefs




sync_integration tests are flaky

Project Member Reported by msramek@chromium.org, Oct 5

Issue description

Tracking bug to de-duplicate many bugs autofilled by Findit.
 
 Issue 892403  has been merged into this issue.
 Issue 892404  has been merged into this issue.
 Issue 892405  has been merged into this issue.
 Issue 892509  has been merged into this issue.
 Issue 892410  has been merged into this issue.
 Issue 892409  has been merged into this issue.
 Issue 892408  has been merged into this issue.
 Issue 892407  has been merged into this issue.
These are all part of a large flake on sync_integration_tests, affecting 71 tests.

The shared cause seems to be:

  [33785:775:1004/030817.207365:ERROR:account_tracker.cc(266)] OnOAuthError
  [33785:775:1004/030817.207408:WARNING:account_tracker.cc(192)] Failed to get UserInfo for gaia_id_for_user@gmail.com
  Closing connection to user@gmail.com/127.0.0.1:53427.chrome-sync
 Issue 892401  has been merged into this issue.
 Issue 892400  has been merged into this issue.
 Issue 892399  has been merged into this issue.
 Issue 892398  has been merged into this issue.
 Issue 892397  has been merged into this issue.
 Issue 892542  has been merged into this issue.
 Issue 892497  has been merged into this issue.
 Issue 892467  has been merged into this issue.
 Issue 892413  has been merged into this issue.
Cc: treib@chromium.org feuunk@chromium.org
 Issue 892412  has been merged into this issue.
 Issue 892411  has been merged into this issue.
 Issue 892406  has been merged into this issue.
 Issue 892402  has been merged into this issue.
 Issue 892396  has been merged into this issue.
 Issue 892395  has been merged into this issue.
 Issue 892394  has been merged into this issue.
Labels: OS-Mac
Re #9: I've seen those messages often even in passing test runs; I don't think they're the cause.

What's interesting is that the flakes are all on Mac only.
Cc: jdoerrie@chromium.org
There is a suspicious crash in the logs:

 sync_integration_tests(29134,0x70001e360000) malloc: *** error for object 0x7fb782073e38: incorrect checksum for freed object - object was probably modified after being freed.
 *** set a breakpoint in malloc_error_break to debug
 Received signal 6
 0   sync_integration_tests              0x000000010f70b7bf base::debug::StackTrace::StackTrace(unsigned long) + 31
 1   sync_integration_tests              0x000000010f70b611 base::debug::(anonymous namespace)::StackDumpSignalHandler(int, __siginfo*, void*) + 2385
 2   libsystem_platform.dylib            0x00007fff65d08f5a _sigtramp + 26
 3   sync_integration_tests              0x000000011751a3f8 base::allocator::AllocatorDispatch::default_dispatch + 0
 4   libsystem_c.dylib                   0x00007fff65aa61ae abort + 127
 5   libsystem_malloc.dylib              0x00007fff65bafad4 szone_error + 596
 6   libsystem_malloc.dylib              0x00007fff65ba4616 tiny_malloc_from_free_list + 1155
 7   libsystem_malloc.dylib              0x00007fff65ba33bf szone_malloc_should_clear + 422
 8   sync_integration_tests              0x000000010f71c59d _ZZN4base9allocator35MallocZoneFunctionsToReplaceDefaultEvEN3$_18__invokeEP14_malloc_zone_tm + 45
 9   sync_integration_tests              0x000000010f71c59d _ZZN4base9allocator35MallocZoneFunctionsToReplaceDefaultEvEN3$_18__invokeEP14_malloc_zone_tm + 45
 10  libsystem_malloc.dylib              0x00007fff65ba31bd malloc_zone_malloc + 103
 11  libsystem_malloc.dylib              0x00007fff65ba24c7 malloc + 24
 12  libc++abi.dylib                     0x00007fff639a3628 operator new(unsigned long) + 40
 13  sync_integration_tests              0x000000010f669d89 base::SequenceCheckerImpl::SequenceCheckerImpl() + 25
 14  sync_integration_tests              0x000000010f6d6668 base::OneShotTimer::OneShotTimer() + 40
 15  sync_integration_tests              0x000000010f4342cf syncer::SyncSchedulerImpl::SyncSchedulerImpl(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, syncer::BackoffDelayProvider*, syncer::SyncCycleContext*, syncer::Syncer*, bool) + 95
 16  sync_integration_tests              0x000000010f3dfc96 syncer::EngineComponentsFactoryImpl::BuildScheduler(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, syncer::SyncCycleContext*, syncer::CancelationSignal*, bool) + 150
 17  sync_integration_tests              0x000000010f42c934 syncer::SyncManagerImpl::Init(syncer::SyncManager::InitArgs*) + 4228
 18  sync_integration_tests              0x000000010f3c3710 syncer::SyncBackendHostCore::DoInitialize(syncer::SyncEngine::InitParams) + 1664
 19  sync_integration_tests              0x000000010f3cb018 void base::internal::FunctorTraits<void (syncer::SyncBackendHostCore::*)(syncer::SyncEngine::InitParams), void>::Invoke<void (syncer::SyncBackendHostCore::*)(syncer::SyncEngine::InitParams), scoped_refptr<syncer::SyncBackendHostCore>, syncer::SyncEngine::InitParams>(void (syncer::SyncBackendHostCore::*)(syncer::SyncEngine::InitParams), scoped_refptr<syncer::SyncBackendHostCore>&&, syncer::SyncEngine::InitParams&&) + 152
 20  sync_integration_tests              0x000000010f5f59c1 base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) + 321
 21  sync_integration_tests              0x000000010f626b4e base::MessageLoop::RunTask(base::PendingTask*) + 286
 22  sync_integration_tests              0x000000010f626f73 base::MessageLoop::DoWork() + 387
 23  sync_integration_tests              0x000000010f62a468 base::MessagePumpDefault::Run(base::MessagePump::Delegate*) + 216
 24  sync_integration_tests              0x000000010f6266e4 base::MessageLoop::Run(bool) + 132
 25  sync_integration_tests              0x000000010f662829 base::RunLoop::Run() + 249
 26  sync_integration_tests              0x000000010f6d008e base::Thread::Run(base::RunLoop*) + 206
 27  sync_integration_tests              0x000000010f6d0607 base::Thread::ThreadMain() + 839
 28  sync_integration_tests              0x000000010f71b99f base::(anonymous namespace)::ThreadFunc(void*) + 95
 29  libsystem_pthread.dylib             0x00007fff65d12661 _pthread_body + 340
 30  libsystem_pthread.dylib             0x00007fff65d1250d _pthread_body + 0
 31  libsystem_pthread.dylib             0x00007fff65d11bf9 thread_start + 13
 [end of stack trace]
 [69/527] TwoClientAutocompleteSyncTest.WebDataServiceSanity (CRASHED)


I've looked at the history of the files in the crash, and the only change that happened in the time range seems to be https://chromium-review.googlesource.com/c/chromium/src/+/1257849 by jdoerrie, but that seems unrelated.
Cc: palmer@chromium.org
Owner: palmer@chromium.org
Another possible suspect: https://chromium-review.googlesource.com/1256211

palmer@: any objection to reverting speculatively?


Does the bug still happen if you turn the optimization back off on macOS?

https://chromium-review.googlesource.com/c/chromium/src/+/1256211/6/base/allocator/partition_allocator/partition_bucket.cc#b481

I'd rather speculatively land that `#if defined` than speculatively revert. If possible. Seems like close to the same amount of work either way?
Also, the stack trace in #28 looks like you're using libc malloc and not PartitionAlloc? In that case something else entirely is going on, presumably.
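
For readers unfamiliar with the kind of guard being proposed here, a minimal sketch follows. It is not the code in partition_bucket.cc: `ZeroFillOptimizationEnabled`, `AllocateZeroed`, and the `known_zeroed` parameter are hypothetical names invented for illustration, and the sketch uses the compiler-provided `__APPLE__` macro rather than whatever platform macro the real change used. The idea is only that an "already zero, skip the memset" shortcut is compiled out on one platform while every other platform keeps it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical switch: reports whether the "skip memset on memory that is
// already known to be zero" shortcut is compiled in. On Apple platforms it
// is compiled out, so the slow but safe path is always taken there.
constexpr bool ZeroFillOptimizationEnabled() {
#if defined(__APPLE__)
  return false;
#else
  return true;
#endif
}

// Hypothetical allocation helper. `known_zeroed` stands in for "the
// allocator believes this memory is already zero"; here that is modelled
// by obtaining it from calloc.
void* AllocateZeroed(std::size_t size, bool known_zeroed) {
  assert(size > 0);
  void* p = known_zeroed ? std::calloc(1, size) : std::malloc(size);
  if (p == nullptr) return nullptr;
  // Only skip the explicit zero fill when the shortcut is compiled in AND
  // the memory is believed to be zero already.
  if (!(ZeroFillOptimizationEnabled() && known_zeroed)) {
    std::memset(p, 0, size);
  }
  return p;
}
```

Either way the caller sees zeroed memory; the guard only changes whether the explicit fill is skipped, which is why landing it is a cheaper experiment than a full revert.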
Project Member

Comment 32 by Findit, Oct 5

Flaky-Test: SingleClientPollingSyncTest.ShouldUpdatePollPrefs
Labels: Test-Flaky Test-Findit-Detected Sheriff-Chromium

SingleClientPollingSyncTest.ShouldUpdatePollPrefs is flaky.

Findit has detected 8 new flake occurrences of this test. List
of all flake occurrences can be found at:
https://findit-for-me.appspot.com/flake/occurrences?key=ag9zfmZpbmRpdC1mb3ItbWVyXAsSBUZsYWtlIlFjaHJvbWl1bUBzeW5jX2ludGVncmF0aW9uX3Rlc3RzQFNpbmdsZUNsaWVudFBvbGxpbmdTeW5jVGVzdC5TaG91bGRVcGRhdGVQb2xsUHJlZnMM.

Since this test is still flaky, this issue has been moved back onto the Sheriff
Bug Queue if it's not already there.

This flaky test was previously tracked in bug 892405.

If the result above is wrong, please file a bug using this link:
https://bugs.chromium.org/p/chromium/issues/entry?status=Unconfirmed&labels=Pri-1,Test-Findit-Wrong&components=Tools%3ETest%3EFindit%3EFlakiness&summary=%5BFindit%5D%20Flake%20Detection%20-%20Wrong%20result%20for%20SingleClientPollingSyncTest.ShouldUpdatePollPrefs&comment=Link%20to%20flake%20occurrences%3A%20https://findit-for-me.appspot.com/flake/occurrences?key=ag9zfmZpbmRpdC1mb3ItbWVyXAsSBUZsYWtlIlFjaHJvbWl1bUBzeW5jX2ludGVncmF0aW9uX3Rlc3RzQFNpbmdsZUNsaWVudFBvbGxpbmdTeW5jVGVzdC5TaG91bGRVcGRhdGVQb2xsUHJlZnMM

Automatically posted by the findit-for-me app (https://goo.gl/Ot9f7N).
 Issue 892668  has been merged into this issue.
 Issue 892696  has been merged into this issue.
 Issue 892739  has been merged into this issue.
 Issue 892752  has been merged into this issue.
 Issue 892802  has been merged into this issue.
Project Member

Comment 38 by bugdroid1@chromium.org, Oct 6

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a7527ee7956b26f1ab1fa9cd165b7c06765ed790

commit a7527ee7956b26f1ab1fa9cd165b7c06765ed790
Author: Chris Palmer <palmer@chromium.org>
Date: Sat Oct 06 01:28:25 2018

Speculatively turn off the `PartitionAllocZeroFill` optimization on macOS.

It might? be the cause of some test flakage.

Bug:  892550 
TBR: ajwong
Change-Id: I4ac98e2e5e3d02526ed2c187157d097b82adc756
Reviewed-on: https://chromium-review.googlesource.com/c/1266199
Commit-Queue: Chris Palmer <palmer@chromium.org>
Reviewed-by: Kentaro Hara <haraken@chromium.org>
Reviewed-by: Marijn Kruisselbrink <mek@chromium.org>
Cr-Commit-Position: refs/heads/master@{#597390}
[modify] https://crrev.com/a7527ee7956b26f1ab1fa9cd165b7c06765ed790/base/allocator/partition_allocator/partition_bucket.cc

Labels: -Sheriff-Chromium
It doesn't seem like the patch above improved anything. Naive question before we move on to the next suspect: why was the last patch only a partial revert of the original? Were the rest of the codepaths even less likely to be relevant? Thx!
Status: Assigned (was: Available)
palmer@: friendly ping for the question above, thanks!
Sorry; I was out sick.

I'm not surprised the CL didn't affect anything. Yes, the other codepaths are even less likely to be relevant.
Owner: ----
Status: Available (was: Assigned)
Is this bug still happening? I'm not sure there's anything else I can do.
Owner: mastiz@chromium.org
Status: Assigned (was: Available)
I can take it from here, thanks!
Project Member

Comment 45 by bugdroid1@chromium.org, Oct 24

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/b7c1cc058ddfd733a2734bbea03d2e11f76b8ae0

commit b7c1cc058ddfd733a2734bbea03d2e11f76b8ae0
Author: Mikel Astiz <mastiz@chromium.org>
Date: Wed Oct 24 06:51:08 2018

Remove switches for ModelTypeStoreBackend performance tweaks

These features are enabled by default with M71 and seem effective
according to experiments on canary&dev, so there's no need to keep these
feature toggles around.

These features may also be the cause for certain unrelated test
flakiness on TSAN, presumably because test-only feature overrides are
not thread safe.

Bug: 887068, 892550 
Change-Id: Ia57e1f161f726db05e2322568da51d3681df8751
Reviewed-on: https://chromium-review.googlesource.com/c/1296504
Reviewed-by: Jan Krcal <jkrcal@chromium.org>
Commit-Queue: Mikel Astiz <mastiz@chromium.org>
Cr-Commit-Position: refs/heads/master@{#602265}
[modify] https://crrev.com/b7c1cc058ddfd733a2734bbea03d2e11f76b8ae0/components/sync/model_impl/model_type_store_backend.cc
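
The commit message above guesses that test-only feature overrides are not thread safe. A minimal sketch of that hazard and one way around it follows. It assumes a hypothetical `FeatureFlag` class and `CountEnabledReads` helper, not Chromium's actual feature-list machinery: background threads read the flag while a test flips it, and the flag is stored in a `std::atomic<bool>` so the concurrent read and write are well defined. With a plain `bool` the same access pattern would be a data race, which is the kind of thing TSAN reports.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical feature flag, for illustration only.
class FeatureFlag {
 public:
  explicit FeatureFlag(bool default_state) : state_(default_state) {}

  bool IsEnabled() const { return state_.load(std::memory_order_acquire); }

  // Test-only override. Safe to call while other threads call IsEnabled()
  // because the state is atomic.
  void OverrideForTesting(bool enabled) {
    state_.store(enabled, std::memory_order_release);
  }

 private:
  std::atomic<bool> state_;
};

// Starts `readers` threads that each poll the flag `reads_per_thread`
// times, enables the flag while they run, and returns how many reads
// observed the flag as enabled. The exact count depends on scheduling.
int CountEnabledReads(FeatureFlag& flag, int readers, int reads_per_thread) {
  assert(readers > 0 && reads_per_thread > 0);
  std::atomic<int> enabled_reads{0};
  std::vector<std::thread> threads;
  for (int i = 0; i < readers; ++i) {
    threads.emplace_back([&flag, &enabled_reads, reads_per_thread] {
      for (int j = 0; j < reads_per_thread; ++j) {
        if (flag.IsEnabled()) {
          enabled_reads.fetch_add(1, std::memory_order_relaxed);
        }
      }
    });
  }
  flag.OverrideForTesting(true);  // Concurrent with the readers above.
  for (std::thread& t : threads) t.join();
  return enabled_reads.load();
}
```

The commit itself took a different route: it removed the toggles entirely, so there is nothing left for tests to override.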

Comment 46 Deleted

Previous message was meant for another bug and hence I deleted it.

Here's a relevant flakiness dashboard for this bug: https://test-results.appspot.com/dashboards/flakiness_dashboard.html#showAllRuns=true&testType=sync_integration_tests&tests=ShouldUpdatePollPrefs

There are some flakes there.
Status: Fixed (was: Assigned)
Most flakes are gone, modulo false alerts, so I'm marking this as fixed.
