New issue
Advanced search Search tips

Issue 923716 link

Starred by 1 user

Issue metadata

Status: Started
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug



Sign in to add a comment

DCHECK failure for AccountReconcilor

Project Member Reported by sinhak@chromium.org, Jan 20 (2 days ago)

Issue description

Chrome Version: M73 ToT
OS: Chrome OS

What steps will reproduce the problem?
(1) Get a fresh, unenrolled device with a debug build and enable crOS Account Manager device-wide with the flag
--enable-features=ChromeOSAccountManager
(2) Try signing-in (creating a new user) on the Chromebook.

What is the expected result?
Sign-in should succeed.

What happens instead?
DCHECK failure at https://cs.chromium.org/chromium/src/components/signin/core/browser/account_reconcilor.cc?l=658&rcl=c3afd9db0cadc4cdd185fa7afef5d16f1d2eb6d9

 

Comment 1 by sinhak@chromium.org, Jan 20 (2 days ago)

Root cause analysis:

The root cause is a race between AccountManagerMigrator enabling account reconciliation (https://cs.chromium.org/chromium/src/chrome/browser/chromeos/account_manager/account_manager_migrator.cc?l=374&rcl=797b2134e2eddbd219df5c20346d4ca2db8b81f9) and OAuth2LoginVerifier trying to reconcile accounts too (https://cs.chromium.org/chromium/src/chrome/browser/chromeos/login/signin/oauth2_login_verifier.cc?l=48&rcl=797b2134e2eddbd219df5c20346d4ca2db8b81f9).

AccountReconcilor rightfully assumes that if mirror account reconciliation is enabled, it is the only entity that can be messing with user cookies and its implementation (https://cs.chromium.org/chromium/src/components/signin/core/browser/account_reconcilor.cc?l=530&rcl=797b2134e2eddbd219df5c20346d4ca2db8b81f9) does not check for reentrancy caused by another entity trying to reconcile accounts. This was not a problem before because mirror account reconciliation was not available on Chrome OS and OAuth2LoginVerifier had nothing to race against.


Temporary Solution:
AccountManagerMigrator should use the correct API, AccountReconcilor::EnableReconcile (https://cs.chromium.org/chromium/src/components/signin/core/browser/account_reconcilor.cc?l=237&rcl=797b2134e2eddbd219df5c20346d4ca2db8b81f9) for enabling reconciliation. That API is already re-entrant and safe to call multiple times.


A more thorough can later be performed about the very existence of race between OAuth2LoginVerifier and AccountReconcilor, i.e. 2 entities trying to fix the user's cookies at the same time.

Comment 2 by droger@chromium.org, Yesterday (45 hours ago)

"assumes that if mirror account reconciliation is enabled, it is the only entity that can be messing with user cookies"

I don't think AccountReconcilor assumes this. I'm trying to understand why exactly we are hitting the DCHECK.

Comment 3 by sinhak@chromium.org, Yesterday (44 hours ago)

droger@: Havent fully grokked the reconcilor's yet but basically this is the flow:

- Reconciliation starts and its preconditions are satisfied.
- Flow of control reaches |FinishReconcile| and |add_to_cookie_| gets populated. In parallel, OAuth2LoginVerifier actually has an in-flight request for mending the cookie for the Primary/Device account and is _usually_ at the uber token fetching stage.
- AccountReconcilor too tries to /Merge the session but its uber token fetch but is kept waiting (https://cs.chromium.org/chromium/src/components/signin/core/browser/gaia_cookie_manager_service.cc?l=514&rcl=58acb7411fa06ff319f01e21fdadd6b3ea538cca) (usually).
- Not 100% sure but _some_ /ListAccounts succeeds and AccountReconcilor::OnGaiaAccountsInCookieUpdated gets called (https://cs.chromium.org/chromium/src/components/signin/core/browser/account_reconcilor.cc?l=530&rcl=58acb7411fa06ff319f01e21fdadd6b3ea538cca), _while_ reconciliation is still is progress (this led me to believe that the reconcilor does not seem to be re-entrant wrt other entities messing with user cookies).
- This leads to FinishReconcile being invoked again and |add_to_cookie_| DCHECK gets hit.

Let me know.

Comment 4 by sinhak@chromium.org, Yesterday (44 hours ago)

*"Havent fully grokked the reconcilor's /invariants/ yet" :(

Comment 5 by droger@chromium.org, Yesterday (43 hours ago)

Components: Services>SignIn
Thanks.

That sounds like a bug in the reconcilor, I would think that fixing the reconcilor is better than trying a workaround, especially if we don't completely understand the bug (because then we can't be sure that the workaround works in all the cases).

I can try to look into this. I haven't tried to repro on ChromeOS yet, I have only looked at the code. I don't completely understand how it is possible that OnGaiaAccountsInCookieUpdated is called during the reconcile.

As you mentioned, OnGaiaAccountsInCookieUpdated is called when ListAccounts completes. However the calls to GaiaCookieManagerService are processed sequentially in a FIFO queue, so they should not be able to execute in parallel.

There is also a mechanism that prevents having two ListAccounts requests in the queue. Since the reconcilor internally does a ListAccount, it should guarantee that there is no other ListAccount request in flight when FinishReconcile is called.

Let me try to dig further.

Comment 6 by droger@chromium.org, Yesterday (43 hours ago)

It looks like I can repro with a ChromeOS build on linux.

At startup I hit the debug. Here is the log with AccountReconcilor logs enabled:

[72391:72391:0121/115853.145810:VERBOSE1:account_reconcilor.cc(221)] AccountReconcilor::Initialize
[72391:72391:0121/115853.145829:VERBOSE1:account_reconcilor.cc(303)] AccountReconcilor::RegisterWithCookieManagerService
[72391:72391:0121/115853.145840:VERBOSE1:account_reconcilor.cc(261)] AccountReconcilor::RegisterWithContentSettings
[72391:72391:0121/115853.145852:VERBOSE1:account_reconcilor.cc(282)] AccountReconcilor::RegisterWithIdentityManager
[72391:72391:0121/115853.145879:VERBOSE1:account_reconcilor.cc(534)] AccountReconcilor::OnGaiaAccountsInCookieUpdated: CookieJar 0 accounts, Reconcilor's state is 1, Error was 
[72391:72391:0121/115853.145900:VERBOSE1:account_reconcilor.cc(640)] AccountReconcilor::ValidateAccountsFromTokenService: Chrome 1 accounts
[72391:72391:0121/115853.145910:VERBOSE1:account_reconcilor.cc(657)] AccountReconcilor::FinishReconcile
[72391:72391:0121/115853.145921:VERBOSE1:account_reconcilor.cc(400)] AccountReconcilor::PerformMergeAction: totosignin@gmail.com
[72391:72391:0121/115853.145990:VERBOSE1:account_reconcilor.cc(534)] AccountReconcilor::OnGaiaAccountsInCookieUpdated: CookieJar 0 accounts, Reconcilor's state is 1, Error was 
[72391:72391:0121/115853.146007:VERBOSE1:account_reconcilor.cc(640)] AccountReconcilor::ValidateAccountsFromTokenService: Chrome 1 accounts
[72391:72391:0121/115853.146017:VERBOSE1:account_reconcilor.cc(657)] AccountReconcilor::FinishReconcile
[72391:72391:0121/115853.146027:FATAL:account_reconcilor.cc(658)] Check failed: add_to_cookie_.empty(). 
#0 0x7f59101fbeef base::debug::StackTrace::StackTrace()
#1 0x7f591012a16a logging::LogMessage::~LogMessage()
#2 0x5618af5ff027 AccountReconcilor::FinishReconcile()
#3 0x5618af5fe8f2 AccountReconcilor::OnGaiaAccountsInCookieUpdated()
#4 0x5618af613848 GaiaCookieManagerService::OnListAccountsSuccess()

Comment 7 by droger@chromium.org, Yesterday (39 hours ago)

Owner: droger@chromium.org
I'm working on a fix

Comment 8 by droger@chromium.org, Yesterday (36 hours ago)

Cc: msarda@chromium.org
To recap:
The reconcilor currently wants to do all the reconciliation in a "atomic" way with respect to the GCMS:
1) it posts all its operations at the same time, so that nobody else hase time to make GCMS calls in between them
2) it assumes that when OnGaiaAccountsInCookieUpdated is called, then there is no other call to list account in flight, and thus OnGaiaAccountsInCookieUpdated is not going to be called before the reconciliation is finished.


It turns out that the assumption 2 is wrong: in a specific edge case (when the AccountReconcilor is initialized from within a OnGaiaAccountsInCookieUpdated notification by another observer), there is an additional call to OnGaiaAccountsInCookieUpdated. This triggers a DCHECK.

I discussed a bit with msarda about the best way to fix this, in particular:
a) should we remove some of the assumptions that the reconcilor makes
b) or should we change GCMS so that it actually works as assumed

We decided to go with a)


Sign in to add a comment