Project: chromium Issues People Development process History Sign in
New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.
Starred by 4 users
Status: Verified
Owner:
Last visit > 30 days ago
Closed: Sep 2012
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Restricted
  • Only users with EditIssue permission may comment.



Sign in to add a comment
Sync invalidation notifications broken after browser restart
Reported by rlarocque@chromium.org, Jul 27 2012 Back to list
We recently fixed a notifications issue, but it seems a new one has popped up.  It may or may not be related to the original.  

When I create two new profiles and sync them to the same account, notifications work very well.  In fact, the client behaviour suggests that  issue 133125  has been fixed.  

Then I close and reopen the browsers.  At that point, notifications stop working.  One of the browsers seemed to see a single notification around the time that it initially registered, but it receives none afterwards.  

This was tested on ToT (r148792, M22) on Linux.  
 
Others are seeing intermittent repros.  This could be a race condition.

I see that the request to set the registered types arrives at ChromeInvalidationClient::RegisterIds() when the ticl_state_ is TRANSIENT_NOTIFICATION_ERROR.  This prevents the call from reaching the RegistrationManager where it would actually take effect.  

The code should be prepared to handle this possibility.  It saves the set of registered IDs in a member variable and will attempt to register them when the transition to a ready state is signaled with ChromeInvalidationClient::Ready().  It looks like that function is never being called, though.
This might explain it:

There are two inputs to GetState() in ChromeInvalidationClient: ticl_state_ and push_client_state_.  At the time when my client sets enabled types, ticl_state_ is 2 (OK/initialized) while push_client_state_ is 0 (uninitialized).  GetState() returns 0.

When ticl_state_ transitions from 0 -> 2, it does so through ChromeInvalidationClient::Ready(), which calls into the RegistrationManager to set our types.

I'm pretty sure there's a bug here somewhere.  Either ChromeInvalidationClient::RegisterIds() should check ticl_state_ instead of using GetState() to check both ticl_state_ and push_client_state_, or the initialization of push_client_state_ should trigger a call to the RegistrationManager.
Comment 3 by akalin@chromium.org, Jul 28 2012
Hunh, interesting.  So ticl_state_ can go to OK/initialized even when push_client_state_ is 0 (XMPP isn't up yet).  That's strange, but I suppose it's possible since the messages on the sync thread can race, and/or the ticl can call itself initialized even if there's no network connection (e.g., when restoring from persisted state).

I think the right solution is to make RegisterIds check ticl_state_ only.
Comment 4 by akalin@chromium.org, Jul 28 2012
Looks like the bug was introduced in r142806.  M21 cut in r142910.  So this is in M21. :(( So when M21 hits stable, I expect a drop in notifications traffic.
Project Member Comment 5 by bugdroid1@chromium.org, Jul 28 2012
The following revision refers to this bug:
    http://src.chromium.org/viewvc/chrome?view=rev&revision=148907

------------------------------------------------------------------------
r148907 | akalin@chromium.org | 2012-07-28T23:06:29.348423Z

Changed paths:
   M http://src.chromium.org/viewvc/chrome/trunk/src/sync/notifier/chrome_invalidation_client.cc?r1=148907&r2=148906&pathrev=148907

[Sync] Fix race condition with notification startup

This fixes a race condition which may cause notifications to not work
on startup.

This is a minimal change for ease of merging.  Unit tests will be landed
in a separate patch.

BUG= 139424 


Review URL: https://chromiumcodereview.appspot.com/10836002
------------------------------------------------------------------------
I tested by applying the patch manually to yesterday's trunk.  I left in the log statements I used to detect the bug originally.  I can confirm that my client "lost" the race, but received notifications anyway.

In addition to adding more unit tests, I think we should do something to increase visibility into the notifier.  Otherwise, it will be difficult to verify this fix actually is actually working (as opposed to the client experiencing a good race outcome), without additional log statements.  
Comment 7 by akalin@chromium.org, Jul 29 2012
Yeah, I agree.  Perhaps we could add some more data to the notifications tab of about:sync, at least the set of registered types (although we have to be careful that we're getting the 'right' set).
Comment 8 by akalin@chromium.org, Jul 29 2012
How to check if a client is affected:

Run with:

--enable-logging --vmodule=chrome_system_resources=1

Look at chrome_debug.log for lines like:

[17710:-1255550976:5484803709643:VERBOSE1:invalidation-client-impl.cc(374)] Register { source: 1004 name: "APP"  }, 1
[17710:-1255550976:5484803762418:VERBOSE1:invalidation-client-impl.cc(374)] Register { source: 1004 name: "APP_NOTIFICATION"  }, 1
[17710:-1255550976:5484803796341:VERBOSE1:invalidation-client-impl.cc(374)] Register { source: 1004 name: "APP_SETTING"  }, 1
...

If they are present, then that particular instance has 'won' the race and notifications are working properly.  If they are not, then that particular instance has 'lost' the race and notifications are not working properly.

Make sure you're not running into http://code.google.com/p/chromium/issues/detail?id=138595 .
Comment 9 by akalin@chromium.org, Jul 29 2012
Handy one-liner to test an instance of Chrome (here Chrome Canary on OS X)

bash -c '(sleep 5; kill $$) & exec /Applications/Google\ Chrome\ Canary.app/Contents/MacOS/Google\ Chrome\ Canary --enable-logging=stderr --vmodule=chrome_system_resources=1' 2>&1 | grep -c 'invalidation-client-impl.*Register'

If it prints out 0, then there were no registrations.  If it prints out a non-zero number, then there were registrations (should be a multiple of 15 if all data types are enabled).
Equivalent one-liner for PowerShell:

$p = Start-Process -FilePath 'C:\Users\akalin\AppData\Local\Google\Chrome SxS\Application\chrome.exe' -ArgumentList '--enable-logging,--vmodule=chrome_system_resources=1' -PassThru; if (!$p.WaitForExit(5000) ) { $p.kill () } cat 'C:\users\akalin\AppData\Local\Google\Chrome SxS\User Data\chrome_debug.log' | where {$_ -match "invalidation-client-impl.*Register"} | Measure-Object -Line
Actually, the timeout should probably be 10s instead of 5s.
Fred, is this the right bug to tag with M21 and merge-request once we're happy with it in canary?
Labels: -Pri-2 Pri-1 Mstone-21 Merge-Requested
Yes.  I think based on Richard's testing, this should be good to merge, no?
I just noticed this when running sync_unit_tests under valgrind for unrelated reasons:

==3250== Process terminating with default action of signal 5 (SIGTRAP)
==3250==    at 0x4ED8591: base::debug::BreakDebugger() (debugger_posix.cc:241)
==3250==    by 0x4F132CA: logging::RawLog(int, char const*) (logging.cc:850)
==3250==    by 0x751BB1: base::TestSuite::UnitTestAssertHandler(std::string const&) (test_suite.cc:233)
==3250==    by 0x4F12E13: logging::LogMessage::~LogMessage() (logging.cc:652)
==3250==    by 0x6C7EED: syncer::SyncNotifierHelper::UpdateRegisteredIds(syncer::SyncNotifierObserver*, std::set<invalidation::ObjectId, syncer::ObjectIdLessThan, std::allocator<invalidation::ObjectId> > const&) (sync_notifier_helper.cc:44)
==3250==    by 0x62C421: syncer::SyncNotifierHelperTest_MultipleRegistration_Test::TestBody() (sync_notifier_helper_unittest.cc:117)
==3250==    by 0x6BEEE2: void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (gtest.cc:2071)
==3250==    by 0x6B5E1D: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (gtest.cc:2123)
==3250==    by 0x6AD484: testing::Test::Run() (gtest.cc:2142)
==3250==    by 0x6ADB8A: testing::TestInfo::Run() (gtest.cc:2319)
==3250==    by 0x6AE0F6: testing::TestCase::Run() (gtest.cc:2426)
==3250==    by 0x6B23F5: testing::internal::UnitTestImpl::RunAllTests() (gtest.cc:4249)
==3250==    by 0x6BBAD2: bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (gtest.cc:2071)
==3250==    by 0x6B74ED: bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (gtest.cc:2123)
==3250==    by 0x6B20E3: testing::UnitTest::Run() (gtest.cc:3882)
==3250==    by 0x751A84: base::TestSuite::Run() (test_suite.cc:199)
==3250==    by 0x6ECA53: main (run_all_unittests.cc:10)

I can't help but think this may be related.  Fred, could you try repro this?  Let's not merge until this is fully understood.
Comment 15 by kareng@google.com, Jul 30 2012
pls ping me when u think you're ready.
Richard, that particular test runs a death test, so the CHECK failure is expected.
Talked to Richard, we think this is ready to merge.
Comment 18 by kareng@google.com, Aug 1 2012
ok so this made it to dev yesterday right? It's working as expected and we're happy with the results?
Yes, and yes.
Comment 20 by kareng@google.com, Aug 1 2012
Labels: -Merge-Requested Merge-Approved
My PowerShell one-liner above is buggy: It doesn't output to chrome_debug.log properly.  Here's a fixed version:

$t = 10000; $dir = 'C:\users\akalin\AppData\Local\Google\Chrome'; $l = "$dir\User Data\chrome_debug.log"; $e = "$dir\Application\chrome.exe"; if (Test-Path $l) { rm $l }; $p = Start-Process -FilePath $e -ArgumentList '--enable-logging', '--vmodule=chrome_system_resources=1' -PassThru; if (! $p.WaitForExit($t) ) { $p.kill() }; cat $l | where {$_ -match 'invalidation-client-impl.*Register'} | Measure-Object -Line
Project Member Comment 22 by bugdroid1@chromium.org, Aug 1 2012
Labels: -Merge-Approved merge-merged-1180
The following revision refers to this bug:
    http://src.chromium.org/viewvc/chrome?view=rev&revision=149443

------------------------------------------------------------------------
r149443 | akalin@chromium.org | 2012-08-01T18:00:06.294025Z

Changed paths:
   M http://src.chromium.org/viewvc/chrome/branches/1180/src/sync/notifier/chrome_invalidation_client.cc?r1=149443&r2=149442&pathrev=149443

Merge 148907 - [Sync] Fix race condition with notification startup

This fixes a race condition which may cause notifications to not work
on startup.

This is a minimal change for ease of merging.  Unit tests will be landed
in a separate patch.

BUG= 139424 


Review URL: https://chromiumcodereview.appspot.com/10836002

TBR=akalin@chromium.org
Review URL: https://chromiumcodereview.appspot.com/10828116
------------------------------------------------------------------------
Project Member Comment 23 by bugdroid1@chromium.org, Aug 3 2012
Labels: merge-merged-1180_57
The following revision refers to this bug:
    http://src.chromium.org/viewvc/chrome?view=rev&revision=149899

------------------------------------------------------------------------
r149899 | tim@chromium.org | 2012-08-03T19:51:20.927994Z

Changed paths:
   M http://src.chromium.org/viewvc/chrome/branches/1180_57/src/sync/notifier/chrome_invalidation_client.cc?r1=149899&r2=149898&pathrev=149899

Merge 148907 - [Sync] Fix race condition with notification startup

This fixes a race condition which may cause notifications to not work
on startup.

This is a minimal change for ease of merging.  Unit tests will be landed
in a separate patch.

BUG= 139424 


Review URL: https://chromiumcodereview.appspot.com/10836002

TBR=akalin@chromium.org
Review URL: https://chromiumcodereview.appspot.com/10832143
------------------------------------------------------------------------
Labels: VerifiedIn-M21
Verified fix, build 21.0.1180.64.
Project Member Comment 25 by bugdroid1@chromium.org, Aug 9 2012
The following revision refers to this bug:
    http://src.chromium.org/viewvc/chrome?view=rev&revision=150665

------------------------------------------------------------------------
r150665 | akalin@chromium.org | 2012-08-09T01:11:32.661140Z

Changed paths:
   D http://src.chromium.org/viewvc/chrome/trunk/src/sync/notifier/mock_invalidation_state_tracker.h?r1=150665&r2=150664&pathrev=150665
   M http://src.chromium.org/viewvc/chrome/trunk/src/sync/sync.gyp?r1=150665&r2=150664&pathrev=150665
   M http://src.chromium.org/viewvc/chrome/trunk/src/sync/notifier/chrome_invalidation_client_unittest.cc?r1=150665&r2=150664&pathrev=150665
   M http://src.chromium.org/viewvc/chrome/trunk/src/sync/notifier/chrome_invalidation_client.cc?r1=150665&r2=150664&pathrev=150665
   M http://src.chromium.org/viewvc/chrome/trunk/src/sync/notifier/chrome_invalidation_client.h?r1=150665&r2=150664&pathrev=150665
   M http://src.chromium.org/viewvc/chrome/trunk/src/sync/notifier/invalidation_notifier_unittest.cc?r1=150665&r2=150664&pathrev=150665
   M http://src.chromium.org/viewvc/chrome/trunk/src/sync/notifier/invalidation_notifier.cc?r1=150665&r2=150664&pathrev=150665
   A http://src.chromium.org/viewvc/chrome/trunk/src/sync/notifier/fake_invalidation_state_tracker.cc?r1=150665&r2=150664&pathrev=150665
   M http://src.chromium.org/viewvc/chrome/trunk/src/sync/notifier/registration_manager_unittest.cc?r1=150665&r2=150664&pathrev=150665
   A http://src.chromium.org/viewvc/chrome/trunk/src/sync/notifier/fake_invalidation_state_tracker.h?r1=150665&r2=150664&pathrev=150665
   M http://src.chromium.org/viewvc/chrome/trunk/src/sync/notifier/registration_manager.cc?r1=150665&r2=150664&pathrev=150665
   D http://src.chromium.org/viewvc/chrome/trunk/src/sync/notifier/mock_invalidation_state_tracker.cc?r1=150665&r2=150664&pathrev=150665
   M http://src.chromium.org/viewvc/chrome/trunk/src/sync/notifier/registration_manager.h?r1=150665&r2=150664&pathrev=150665

[Sync] Rework unit tests for ChromeInvalidationClient

In particular, add unit tests that would have caught  bug 139424 .

Dep-inject InvalidationClient into ChromeInvalidationClient.

Use the function name 'UpdateRegisteredIds' consistently.

Replace some mocks with fakes.

BUG= 139424 


Review URL: https://chromiumcodereview.appspot.com/10827133
------------------------------------------------------------------------
Comment 26 by tim@chromium.org, Aug 13 2012
Labels: -Feature-Sync Feature-Invalidations
Comment 27 by tim@chromium.org, Aug 13 2012
Labels: Feature-Sync
Status: Verified
I think this is fixed and verified.
Status: Started
I have a suspicion that this is still happening.  However, I tried on Linux stable (M21) and beta (M22) and the registrations are going through.  For posterity, here's the command:

bash -c '(sleep 5; kill $$) & exec /usr/bin/google-chrome --enable-logging=stderr --vmodule=chrome_system_resources=1' 2>&1 | grep -c 'invalidation-client-impl.*Register'

I will try Windows/Mac next.
Checked Mac on all channels (Stable, Beta, Dev, Canary) and it works fine.
Checked Windows on Stable + Canary and it works fine.

For posterity, the command for XP:

$t = 10000; $dir = 'C:\Documents and Settings\akalin\Local Settings\Application Data\Google\Chrome'; $l = "$dir\User Data\chrome_debug.log"; $e = "$dir\Application\chrome.exe"; if (Test-Path $l) { rm $l }; $p = Start-Process -FilePath $e -ArgumentList '--enable-logging', '--vmodule=chrome_system_resources=1' -PassThru; if (! $p.WaitForExit($t) ) { $p.kill() }; cat $l | where {$_ -match 'invalidation-client-impl.*Register'} | Measure-Object -Line

For canary:

$t = 10000; $dir = 'C:\Documents and Settings\akalin\Local Settings\Application Data\Google\Chrome SxS'; $l = "$dir\User Data\chrome_debug.log"; $e = "$dir\Application\chrome.exe"; if (Test-Path $l) { rm $l }; $p = Start-Process -FilePath $e -ArgumentList '--enable-logging', '--vmodule=chrome_system_resources=1' -PassThru; if (! $p.WaitForExit($t) ) { $p.kill() }; cat $l | where {$_ -match 'invalidation-client-impl.*Register'} | Measure-Object -Line
Looks like no evidence that this bug is back. (Although the same symptoms may be present)
Status: Verified
Project Member Comment 34 by bugdroid1@chromium.org, Oct 14 2012
Labels: Restrict-AddIssueComment-Commit
This issue has been closed for some time. No one will pay attention to new comments.
If you are seeing this bug or have new data, please click New Issue to start a new bug.
Project Member Comment 35 by bugdroid1@chromium.org, Mar 10 2013
Labels: -Area-Internals -Mstone-21 -Feature-Sync Cr-Services-Sync M-21 Cr-Internals
Project Member Comment 36 by bugdroid1@chromium.org, Mar 14 2013
Labels: -Restrict-AddIssueComment-Commit Restrict-AddIssueComment-EditIssue
Sign in to add a comment