New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 878718 link

Starred by 2 users

Issue metadata

Status: Duplicate
Merged: issue 880633
Owner:
Closed: Sep 5
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 2
Type: Bug



Sign in to add a comment

Access tokens don’t refresh

Project Member Reported by sfiera@chromium.org, Aug 29

Issue description

Chrome Version: 70.0.3534.0 (Official Build) canary (64-bit)
OS: Mac, 10.13.6 (17G65)

I upgraded Canary on Mon Aug 27 11:27:27. Since then, several of my access tokens on chrome://signin-internals have not refreshed, breaking Sync (see attached screenshot). However, some of my tokens *have* refreshed and continue to do so.


Some details from signin-internals in text form:

Basic Information
 Chrome Version       70.0.3534.0 canary	 
 Account Consistency  DICE	 
 Signin Status        Signed In

Last Signin Details
 Gaia Authentication Result	 
 RefreshToken Received       Successful (106082914502521534264)  8/28/18, 10:31:45 AM
 SigninManager Started
 SigninManager Completed     4/10/18, 11:26:57 AM                Successful


This sounds possibly similar to  crbug.com/869376 .

Separately, when Sync attempts to use the token, it reports NETWORK_CONNECTION_UNAVAILABLE instead of AUTH_ERROR, which distinguishes this from crbug.com/875315.
 
Screen Shot 2018-08-29 at 12.25.54.png
310 KB View Download
Cc: treib@chromium.org
Cc: blundell@chromium.org
Components: Services>SignIn Services>Sync
Labels: -Pri-3 Pri-2
Also, I tried running Canary with a different user-dir and didn’t run into this problem, so it’s tied to Chrome’s state, not the account. I will try to remember not to restart or upgrade until someone’s told me it’s OK.
Current state attached.

I have more tokens. Some of the expired tokens have been successfully fetched since yesterday, but then expired.
Screen Shot 2018-08-30 at 13.30.32.png
349 KB View Download
Cc: msarda@chromium.org
Owner: droger@chromium.org
Adding more signin folks.
This still looks quite similar to bug 875315. There, it was the case that token requests got wrongly canceled under some circumstances, and *some* consumers had custom retry logic that hid this problem. And the set of services with a valid token looks quite similar (gcm_network_channel, suggestions_service, extensions_identity_api), so I'm suspecting this is a similar problem.
Cc: droger@chromium.org zea@chromium.org
Owner: blundell@chromium.org
Status: Started (was: Untriaged)
Back from vacation now :).

1. sfiera@, could you provide a screenshot of sync-internals as well?

2. Wrt this bug and crbug.com/875315, there is something that I'm not understanding: Sync has custom retry logic. I detailed my investigation of the impact of  crbug.com/869376  here: https://docs.google.com/document/d/1oyzXBl4Z-PLh36br9l9T2yhsHiPnZVmY3XF-B7wI2DI/edit#heading=h.al2bfduymcp. My expectation (as I wrote up there) was that sync would *not* be affected. So either (a) there's something that I'm not yet understanding about the impact of  crbug.com/869376  (which could also imply that my fix was not complete), or (b) there's something else going on (of course, it's very suspicious that Nicolas' problem has seemed to go away once he updated to the version with the cherrypick).

I'd like to try to understand why sync seems to not be retrying and obtaining a valid access token.
Hmm. Sync is working now after the weekend, though there’s still something weird about my signin state.
signin-internals.png
333 KB View Download
sync-internals.png
550 KB View Download
Oh, and sorry I didn’t capture sync-internals in the former, bad state. My recollection is that all datatypes were listed as “Backed off”, and in the Sync Protocol Log, the client would issue a request and receive NETWORK_CONNECTION_UNAVAILABLE every ten minutes.

In particular, I don’t recall what was in the “Identity” and “Credentials” sections, but whatever it was, it wasn’t anything obviously bad [to us] as it took quite a bit more investigation before we realized the signin state was bad.
We’re back out of business!

Screenshots attached, plus the specific excerpt from “Dump status” that I assume you were looking for:
    {
      "data": [ {
        "is_valid": true,
        "stat_name": "Requested Token",
        "stat_value": "Monday, September 3, 2018 at 9:41:57 AM"
      }, {
        "is_valid": true,
        "stat_name": "Received Token",
        "stat_value": "Monday, September 3, 2018 at 9:41:57 AM"
      }, {
        "is_valid": true,
        "stat_name": "Token Request Status",
        "stat_value": "OK"
      }, {
        "is_valid": true,
        "stat_name": "Next Token Request",
        "stat_value": "not scheduled"
      }],
      "is_sensitive": false,
      "title": "Credentials"
    },
signin-internals.png
345 KB View Download
sync-internals.png
584 KB View Download
Hm, looking over the code some more, I think the NETWORK_CONNECTION_UNAVAILABLE error might be the key: It looks like on this error, we don't even get to the point of trying to acquire an access token. Which makes sense in itself, but the question is why we're getting a network error in the first place. Also weird is that gaia_account_tracker and profile_downloader seem to have the same problem.
Still, the Sync client does issue a request to the server. The attached request from chrome://net-internals corresponds to:
9/5/2018, 11:29:23 AM GetUpdates Response
Received error: NETWORK_CONNECTION_UNAVAILABLE

So the Sync client may not get to the point of requesting an access token, but it definitely gets *past* that point (and a 401, obviously).
net-internals.txt
4.3 KB View Download
The behavior of the services other than sync appears to be similar on 68, which is before  crbug.com/869376  (see attachment). So I think that we can concentrate on sync for now.


68_signin_internals.png
237 KB View Download
Wrt comment #10, it's possible that the error here isn't related to access tokens at all. If sync isn't getting to the point of making a request with the access token and having it fail due to invalid credentials, then sync won't try to request a new access token. Hence it could end up in a state where it has an expired access token but no next access token fetch scheduled.

Agreed with Marc that the pressing question is why we're getting the NETWORK_CONNECTION_UNAVAILABLE error.
(Re: comment #14, the behavior that I described is my understanding of sync's logic, at least).
It looks like NETWORK_CONNECTION_AVAILABLE is set only when the server_status is set to HttpResponse::CONNECTION_UNAVAILABLE (cf. https://cs.chromium.org/search/?q=%22NETWORK_CONNECTION_UNAVAILABLE%22&sq=package:chromium&type=cs).

In turn, HttpResponse::CONNECTION_UNAVAILABLE is set in the following places:

https://cs.chromium.org/search/?q=%22%3D+HttpResponse::CONNECTION_UNAVAILABLE%22&sq=package:chromium&type=cs

Unsurprisingly, comments on server_connection_manager.h indicate that SYNC_AUTH_ERROR should be set in response to receiving a 401 error (https://cs.chromium.org/chromium/src/components/sync/engine_impl/net/server_connection_manager.h?type=cs&q=SYNC_AUTH_ERROR&sq=package:chromium&g=0&l=49).

It's not at all obvious to me how we're getting from reception of the 401 error to setting HttpResponse::CONNECTION_UNAVAILABLE.
Marc, is it possible that https://chromium-review.googlesource.com/c/chromium/src/+/1051242 has affected the flow here in a way that tickles some weirdness? This method: https://cs.chromium.org/chromium/src/components/sync/engine_impl/sync_manager_impl.cc?sq=package:chromium&g=0&l=476 returns early if its call to SetAuthToken() returns false. SetAuthToken(), in turn, returns false if the passed-in auth token is empty: https://cs.chromium.org/chromium/src/components/sync/engine_impl/net/server_connection_manager.cc?q=SetAuthToken&sq=package:chromium&g=0&l=173. Thus, it's possible that not setting the auth token to "credentials_lost" in ProfileSyncService has some behavioral impact.
Thanks for the investigation so far!
That CL does look plausibly related, though I still don't see how the empty token would result in a CONNECTION_UNAVAILABLE error.
It looks like before that CL, we would have done a "scheduler_->OnCredentialsUpdated()" call that we're not doing anymore now, maybe that somehow got us out of the bad state again? But the connection to the connection error eludes me...
Cross-ref:  bug 880633  is almost certainly a dupe, and the investigation there has found a likely culprit.
Mergedinto: 880633
Status: Duplicate (was: Started)
Whoohoo! Going to dupe this bug into that one, where the relevant discussion is happening.
Project Member

Comment 21 by bugdroid1@chromium.org, Sep 11

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/05b9b976f63eb1f9fcca0e01c15a6eeeb719dd69

commit 05b9b976f63eb1f9fcca0e01c15a6eeeb719dd69
Author: Antonio Gomes <tonikitoo@igalia.com>
Date: Tue Sep 11 17:10:45 2018

Fix sync logic failing with repeated instance of "network connection unavailable"

This is a follow on of [1], where the sync::HttpBridge logic switched
away from URLFetcher to SimpleURLLoader, as part of the network
servicification effort.

[1] https://crrev.com/c/1174655/

It turns out that the way sync::HttpBridge interpreted the response
of an upload request changed after the migration:

By default, SimpleURLLoader does not pass a valid response body in case
of either HTTP or network errors. What is more, the new logic in
HttpBridge::OnURLLoadComplete relied precisely on null-checking the responsebody
to verify whether the upload was successful or not.

This CL changes this particular check logic so that it matches the
previous behavior: only actual network errors account for sync failures
(which map to "network connection unavailable" errors) whereas HTTP errors
(>= 400 && < 600) still account as a successful uploads.
401 specifically is handled to request expired user credentials in [2].

[2] https://cs.chromium.org/chromium/src/components/sync/engine_impl/net/sync_server_connection_manager.cc?l=84

BUG= 844968 , 878718 , 880633 

Change-Id: I9d0b93b527f103149a7c78ba405dde27cae779f2
Reviewed-on: https://chromium-review.googlesource.com/1217122
Commit-Queue: Antonio Gomes <tonikitoo@igalia.com>
Reviewed-by: Matt Menke <mmenke@chromium.org>
Reviewed-by: Marc Treib <treib@chromium.org>
Cr-Commit-Position: refs/heads/master@{#590363}
[modify] https://crrev.com/05b9b976f63eb1f9fcca0e01c15a6eeeb719dd69/components/sync/engine/net/http_bridge.cc
[modify] https://crrev.com/05b9b976f63eb1f9fcca0e01c15a6eeeb719dd69/components/sync/engine/net/http_bridge_unittest.cc

Project Member

Comment 22 by bugdroid1@chromium.org, Sep 14

Labels: merge-merged-3538
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/bb324165da2232905e108c07ef42e288db468fac

commit bb324165da2232905e108c07ef42e288db468fac
Author: Antonio Gomes <tonikitoo@igalia.com>
Date: Fri Sep 14 06:31:30 2018

Fix sync logic failing with repeated instance of "network connection unavailable"

This is a follow on of [1], where the sync::HttpBridge logic switched
away from URLFetcher to SimpleURLLoader, as part of the network
servicification effort.

[1] https://crrev.com/c/1174655/

It turns out that the way sync::HttpBridge interpreted the response
of an upload request changed after the migration:

By default, SimpleURLLoader does not pass a valid response body in case
of either HTTP or network errors. What is more, the new logic in
HttpBridge::OnURLLoadComplete relied precisely on null-checking the responsebody
to verify whether the upload was successful or not.

This CL changes this particular check logic so that it matches the
previous behavior: only actual network errors account for sync failures
(which map to "network connection unavailable" errors) whereas HTTP errors
(>= 400 && < 600) still account as a successful uploads.
401 specifically is handled to request expired user credentials in [2].

[2] https://cs.chromium.org/chromium/src/components/sync/engine_impl/net/sync_server_connection_manager.cc?l=84

BUG= 844968 , 878718 , 880633 

Change-Id: I9d0b93b527f103149a7c78ba405dde27cae779f2
Reviewed-on: https://chromium-review.googlesource.com/1217122
Commit-Queue: Antonio Gomes <tonikitoo@igalia.com>
Reviewed-by: Matt Menke <mmenke@chromium.org>
Reviewed-by: Marc Treib <treib@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#590363}(cherry picked from commit 05b9b976f63eb1f9fcca0e01c15a6eeeb719dd69)
Reviewed-on: https://chromium-review.googlesource.com/1225490
Reviewed-by: Antonio Gomes <tonikitoo@igalia.com>
Cr-Commit-Position: refs/branch-heads/3538@{#400}
Cr-Branched-From: 79f7c91a2b2a2932cd447fa6f865cb6662fa8fa6-refs/heads/master@{#587811}
[modify] https://crrev.com/bb324165da2232905e108c07ef42e288db468fac/components/sync/engine/net/http_bridge.cc
[modify] https://crrev.com/bb324165da2232905e108c07ef42e288db468fac/components/sync/engine/net/http_bridge_unittest.cc

Sign in to add a comment