Issue metadata
Sign in to add a comment
|
Smart Lock not working after update to 70.0.3538.22 (beta)
Reported by
james.ce...@gmail.com,
Sep 21
|
||||||||||||||||||||||
Issue descriptionUserAgent: Mozilla/5.0 (X11; CrOS x86_64 11021.19.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.22 Safari/537.36 Platform: 11021.19.0 (Official Build) beta-channel caroline Example URL: Steps to reproduce the problem: 1. Enable Bluetooth on Chromebook and phone 2. Set up Smartlock in chromebook settings 3. Receive notification on phone that Smartlock is successfully set up 4. In dialog box on chromebook with heading "Smart Lock is All Set" click the blue button that says, "Try Smart Lock" What is the expected behavior? Lock screen should appear, and immediately allow the chromebook to be unlocked using the smartlock feature. Chromebook should detect phone and advise user to either click their profile picture, bring the phone closer, or unlock the phone. At least some indication should be given that the phone was detected. What went wrong? All I get is the plain lock screen with no indication that the phone was detected or that the smartlock feature is even enabled. Tried all of the following without effect: - Bringing the phone closer or farther - Restarting Bluetooth on both phone and chromebook - Restarting the chromebook - Repeating the entire procedure several times Did this work before? Yes 69.0.3497.73 (worked with smartlock, but had other startup crash issues) - more likely was .87 or .95 Chrome version: 70.0.3538.22 Channel: beta OS Version: 11021.19.0 Flash Version: 31.0.0.113 It is understood that I am on beta, and that this update was released yesterday. I just want to bring it to someone's attention, since there are likely fewer other users of this older hardware that might also be experiencing this. Thanks for looking into it.
,
Sep 25
,
Sep 25
,
Sep 25
We've received 9 users reports so far that Smart Lock is not working on M70 Beta, from a variety of release boards. Some users report that they are told that Smart Lock is done, but it's still not working. Sample User Feedback reports (with logs): https://listnr.corp.google.com/report/85676678569 https://listnr.corp.google.com/report/85675761607 https://listnr.corp.google.com/report/85675487168 https://listnr.corp.google.com/report/85674762490 https://listnr.corp.google.com/report/85674180810 https://listnr.corp.google.com/report/85673986404 https://listnr.corp.google.com/report/85673941153
,
Sep 26
,
Sep 27
,
Sep 27
Thanks for the report, James! Do you mind checking which version of Google Play Services you have on your phone? You can do so by going here on your phone: https://play.google.com/store/apps/details?id=com.google.android.gms. Then click "Read More" by the app description and you should see a version number somewhere towards the bottom of that page.
,
Sep 27
Ryan, I looked at the log in the first feedback report in comment 4 and this line stands out: [1246:1246:0921/182918.871945:ERROR:easy_unlock_create_keys_operation.cc(371)] Easy unlock failed to create key, code=2
,
Sep 28
,
Sep 28
Sure thing. I'm running GPS 13.2.80 as of Sept 6th on a well-aged stock Nexus 6 (Android 7.1.1 build N8I11F).
,
Sep 28
At the risk of sounding false-alarmist, there is new information that I hope helps you find the cause of the problem: While still on ChromeOS 70.0.3538.22, Smartlock began working again between the Nexus 6 and the Chromebook Pro. It was after three or four more restarts, after five or six days had passed. No additional smartlock setup was necessary. A startup notification said that "Smartlock was now enabled, and would be usable on the next restart." That message had not appeared before. Smartlock has been working since (on the same ChromeOS build). New options have appeared in the Screen Lock Settings (see attachment) that confirm that smartlock is now enabled. I mention this because immediately following the ChromeOS update when smartlock was NOT working for me, this section of the settings page had only one option: "Set up Smart Lock". The passage of time and restarts suggests it may have been a cache issue that needed clearing for everything to play nice. However, a false confirmation was still shown to the user, giving the impression that the smartlock setup process was successful.
,
Sep 28
Thanks, James! That's useful info and I'm glad that Smartlock is back to working again for you. However, I agree that it still seems like something isn't right here. It sounds to me like there was some state that didn't fully sync between the two devices for one reason or another. I'll leave it to Ryan to investigate further and triage this appropriately.
,
Sep 28
Investigating.
,
Oct 1
,
Oct 1
[Sidenote: It's pretty difficult to read through the feedback reports -- I suspect that a lot of logs are being stripped out. For example, take https://listnr.corp.google.com/report/85674180810. It has a lot of logs from bluetooth_low_energy_connection_finder.cc (this is only used by Smart Lock), but no logs from other classes in Smart Lock that I would expect. Am I missing something?] That being said, https://listnr.corp.google.com/report/85676678569, https://listnr.corp.google.com/report/85674180810, and https://listnr.corp.google.com/report/85673986404 all have the error message "ERROR:easy_unlock_create_keys_operation.cc(371)] Easy unlock failed to create key, code=2" error that Jeremy alluded to. These 3 occurred across a variety of boards, including Pixelbook. It could be that Smart Lock is failing to persist device information, causing it to not even show up on the lock screen, as users are reporting. Error code=2 in this case refers to cryptohome::MountError::MOUNT_ERROR_KEY_FAILURE in dbus [1]. I know we've saw cryptohome persistence issues a month or so ago -- this could be related. I will continue to investigate if this is either a Smart Lock or dbus/cryptohome bug. 1) https://cs.chromium.org/chromium/src/third_party/cros_system_api/dbus/cryptohome/dbus-constants.h?q=cryptohome::MountError&sq=package:chromium&g=0&l=153
,
Oct 1
Looking more at feedback report https://listnr.corp.google.com/report/85676678569. Its logs also include: [1246:1246:0921/182839.433158:ERROR:extended_authenticator_impl.cc(308)] Supervised user cryptohome error, code: 2 This is same error that Smart Lock is hitting. Further down, you can observe dozens of instances of the following errors: 2018-09-21T18:19:07.012796-07:00 ERR cryptohomed[1393]: TPM error 0x21 (Decryption error): Error calling Tspi_Data_Unbind 2018-09-21T18:19:07.014435-07:00 ERR cryptohomed[1393]: TPM error 0x21 (Decryption error): Retrying will not help. 2018-09-21T18:19:07.014564-07:00 ERR cryptohomed[1393]: The TPM failed to unwrap the intermediate key with the supplied credentials ... 2018-09-21T18:19:07.582589-07:00 ERR cryptohomed[1393]: TPM error 0x21 (Decryption error): Error calling Tspi_Data_Unbind 2018-09-21T18:19:07.582756-07:00 ERR cryptohomed[1393]: TPM error 0x21 (Decryption error): Retrying will not help. 2018-09-21T18:19:07.582809-07:00 ERR cryptohomed[1393]: The TPM failed to unwrap the intermediate key with the supplied credentials 2018-09-21T18:19:07.582924-07:00 ERR cryptohomed[1393]: Failed to decrypt any keysets for abd5a2707ffef6938a0ed7caab52a3b12fd50335: mount error 2, crypto error 8 2018-09-21T18:19:07.582960-07:00 WARNING cryptohomed[1393]: AddKeyset: invalid authentication provided I can't reproduce this on my local test machine when I flash it to 11021.19.0 70.0.3538.22. Perhaps something changed in cryptohome that causes issues when migrating from M69 to M70? James's description of a resolution in c#11 makes it seems like cryptohome possibly got itself out of a bad state eventually once it wasn't working with an old stored state from M69. Because both extended_authenticator_impl.cc and easy_unlock_create_keys_operation.cc are hitting this issue independently, I believe this is an issue in cryptohome itself, and not the clients of cryptohome. CCing a few people: +derat@ and rkc@: it seems you both have some context with cryptohome and the TPM. If you have any insight, or know who should look at this, any help would be appreciated! +hashimoto@: it seems you've done a fair amount of work with Cryptohome. Do you have any insight here? +jdufault@: Owner of extended_authenticator_impl.cc.
,
Oct 1
(forgot to CC derat@)
,
Oct 1
+apronin@ for cryptohome/TPM
,
Oct 1
Re #16: All cryptohomed says at this point is that some keyset file cannot be decrypted with the combination of supplied credentials (then the error is somewhere in upper SmartLock layers) and cryptohome key (thoughm no attempts to load the key, so it must be loaded fine; but maybe the smart unlock keyset survived changing the cryptohome key somehow?) When we authorize for SmartUnlock, do we use an empty or non-empty keyset label? If non-empty: most likely something is wrong with the credentials (stale?). If empty: I'd say trying to authorize against other keysets is the most likely cause for these lines in the log. Any chance the same errors are present on M69, where everything works? It is possible that we are attempting to use SmartUnlock credentials against all existing keysets, incl the password keyset, instead of just against the SmartUnlock keyset. Then, yes, we'd get those errors. Just in case, is it possible to get full logs for the device from #16? The syslog messages from boot time are not captured in the feedback report, and there may be something interesting there, even though unlikely (from the TPM angle): no issues with loading cryptohome key, no errors from TPM commands.
,
Oct 1
Thanks for the quick response Andrey! I didn't personally write this code, so I'm not 100% familiar with it, but re: your label question, it looks like Smart Lock is using an empty keyset label here: [1]. Do you recommend that we make sure a label is used here? (will that prevent migration from older versions?) Unfortunately, no, we can't get full logs in #16 (I would also appreciate having them). Andrey, does #15 look like a decryption issue to you as well? This code is adding cryptohome keys, using AddKeyEx() [1]. Does it concern you that both this call and the similar call to CheckKeyEx() in extended_authenticator_impl.cc [2] are failing for the same reason? (error code 2, MOUNT_ERROR_KEY_FAILURE). 1) https://cs.chromium.org/chromium/src/chrome/browser/chromeos/login/easy_unlock/easy_unlock_create_keys_operation.cc?q=easy_unlock_create_keys_operation.cc&sq=package:chromium&dr&l=356 2) https://cs.chromium.org/chromium/src/chromeos/login/auth/extended_authenticator_impl.cc?q=extended_authenticator_impl.cc&dr&l=308
,
Oct 2
I was investigating our metrics -- check out SmartLock.EnabledState [1]. The "Disabled" count spiked dramatically around September 20 (the spike has since dropped back to normal as of today). All of these feedback reports were filed on the 20th or 21st, so I believe this spike exactly corresponds to this bug. The "Disabled" event is emitted when the EasyUnlockEnabledStateSet pref is |true|, but the CryptAuthDeviceManager returns an empty set of unlock keys [2]. For some reason, CryptAuthDeviceManager suddenly thought the user didn't have a Smart Lock host device. Notice as well (in the uma/ link) that 70.0.3538.22 was actually released 2 days after this spike, while it was falling. That is, this bug seems orthogonal to client update. All of this suggests to me that this bug may actually be a server issue. Jeremy, did CryptAuth experience any recent issues that would cause it return incorrect device metadata? (or none at all?) (Andrey: Apologies for a comment unrelated to what I asked you -- I'm still curious about my questions in #20) 1) https://uma.googleplex.com/p/chrome/timeline_v2?sid=689f148b1a113a8503363e8d502224c9 2) https://cs.chromium.org/chromium/src/chrome/browser/chromeos/login/easy_unlock/easy_unlock_service_regular.cc?q=easyunlockserviceregular&sq=package:chromium&dr=CSs&l=183
,
Oct 2
Nice digging, Ryan! I've checked a few things here: There wasn't a gmscore push or flag flip around that time. Max did land cl/213649156 in cryptauth on the 19th, but I can't tell whether that would have gone to prod already by the 20th (seems unlikely). Looking at the cryptauth dashboards for devicesync (http://shortn/_YfOhq1K67P), it does seem like there was a slight bump in usage around then, but nothing crazy. Most damning, however,is that the spike only seemed to happen on the Beta channel. If this was an issue outside of Chrome, why wouldn't it affect dev and stable? Are you sure that 70 didn't go out until a couple days after this? I know that uma shows that, but I see some emails referencing the 70 beta cut on the 20th. Is there somewhere you can check this for sure? I also wonder if our create dogfood was announced right around then... Another note here is that it looks like the stable report count for this state is rising over the last week or so. We should see if that's a concern.
,
Oct 2
,
Oct 2
Jeremy, to your last point "it looks like the stable report count for this state is rising over the last week or so" -- I don't think this is a concern. The total count for this histogram on stable-channel has been rising (Enabled, Disabled, and Unset have all been rising). Anyway, great point Jeremy re: this only seems to affect beta-channel. After giving this more thought, I think this is a client issue, not a server issue as I suspected. What originally threw me off was that 70.0.3538.22 in the UMA timeline was listed as being after the spike (Sep 22), but after examining goldeneye, it seems that RC was actually released Sep 20, the time of the spike. (We should followup with the UMA team to clarify what is causing this discrepancy). Check this out: when the UMA graph is extended to 90 days in the past [1], you can see a similar spike in dev on August 23. Something seems to have changed migrating from 69 to 70, definitely. It seems that the issue at least resolves itself, but I want to try to resolve this quickly so we can prevent this temporary breakage from affecting stable-channel in the future. I'm digging now. My first suspicion is that something may be wrong with how CryptAuthDeviceManager handles pref (de)serialization with respect to the new feature bits. 1) https://uma.googleplex.com/p/chrome/timeline_v2?sid=03c9f90af5227fbbb51c37359dad0df0
,
Oct 3
Ok, I think I found the cause of this issue :)
My suspicion about CryptAuthDeviceManager being the culprit was correct. I flashed my test device to R69-10895.21.0, set up Smart Lock and confirmed it worked, and then flashed it to HEAD. This allowed me to simulate a user migrating from 69 by preserving the user account with prefs.
Once I had done that, I printed out the saved prefs that pointed to my host device. Its software_features dictionary read:
"software_features": {
"3": 2, (EASY_UNLOCK_HOST == kEnabled)
"5": 1 (MAGIC_TETHER_HOST == kSupported)
}
I was pretty confused, because everything looked in order, but CryptAuthDeviceManager::GetUnlockKeys() still didn't think an unlock key was present. But then I tried to compare these old prefs to new prefs -- new prefs represent the dictionary like so:
"software_features": {
"easyUnlockHost": 2,
"magicTetherHost": 1
}
The culprit is this CL [1]. When it changed ExternalDeviceInfos vectors to contain strings instead of ints, it did not correctly migrate previous pref values from int to string inside CryptAuthDeviceManagerImpl::AddSoftwareFeaturesToExternalDevice() [2].
I'm working on the fix now. I will manually test my fix using the same repro steps I described above.
1) https://chromium-review.googlesource.com/c/chromium/src/+/1145588
2) https://cs.chromium.org/chromium/src/components/cryptauth/cryptauth_device_manager_impl.cc?q=AddSoftwareFeaturesToExternalDevice&sq=package:chromium&g=0&l=297
,
Oct 3
Very nicely done, Ryan!! Bit by Apiary string formatting once again...
,
Oct 3
Wow, fell out of discussion for a while (I have an excuse I'm this week's sheriff ;)), and by the time I return the investigation is complete. Very nice done indeed! Just in case: any remaining questions I need to address? For this: > Do you recommend that we make sure a label is used here? (will that prevent migration from older versions?) Yes, in general, for all auth attempts I recommend specifying the appropriate keyset label (if it is fixed) to avoid walking through all existing user keysets and trying the provided credentials against each one until it succeeds. So, if some known fixed label was always used for SmartLock keys in AddKeyEx, then no migration is needed, just use the same label when authorizing. If it's possible that SmartLock keys were created with different labels in the past, then we can stay on legacy behavior. But in any case, that optimization is unrelated to this bug.
,
Oct 3
Thank you Andrey (and others) for helping out here even though the bug was unrelated to you. I've opened crbug.com/891557 to keep track of the label issue. Other than the label issue, Andrey, I don't believe I have any more questions for you specifically. Thanks!
,
Oct 3
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/d7a68aaef311d65588fcda5407c351dcb0d6a2c8 commit d7a68aaef311d65588fcda5407c351dcb0d6a2c8 Author: Ryan Hansberry <hansberry@chromium.org> Date: Wed Oct 03 17:13:27 2018 CryptAuthDeviceManager: correctly migrate old SoftwareFeature prefs. When SoftwareFeatures were changed to be stored in prefs as strings instead of as ints [1], CryptAuthDeviceManagerImpl failed to also migrate old int prefs to the new string representation when the prefs were desrialized. This caused all devices migrating from M69 to a newer release to believe none of their synced devices had any SoftwareFeatures until another sync occurred. 1) https://chromium-review.googlesource.com/c/chromium/src/+/1145588 Bug: 888031 Change-Id: Icd666106d6ebd4db5620cfab0dac459ef67df23e Reviewed-on: https://chromium-review.googlesource.com/1258515 Reviewed-by: Kyle Horimoto <khorimoto@chromium.org> Commit-Queue: Ryan Hansberry <hansberry@chromium.org> Cr-Commit-Position: refs/heads/master@{#596263} [modify] https://crrev.com/d7a68aaef311d65588fcda5407c351dcb0d6a2c8/components/cryptauth/cryptauth_device_manager_impl.cc [modify] https://crrev.com/d7a68aaef311d65588fcda5407c351dcb0d6a2c8/components/cryptauth/cryptauth_device_manager_impl_unittest.cc
,
Oct 3
Requesting merge into M70. This needs to land in M70 before it hits stable (without it, Smart Lock will be broken).
,
Oct 3
,
Oct 3
This bug requires manual review: We are only 12 days from stable. Please contact the milestone owner if you have questions. Owners: benmason@(Android), kariahda@(iOS), geohsu@(ChromeOS), abdulsyed@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Oct 3
,
Oct 3
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/015c3485f6e7300e19bd3288861f5ea6dc6b7ca1 Commit: 015c3485f6e7300e19bd3288861f5ea6dc6b7ca1 Author: hansberry@chromium.org Commiter: hansberry@chromium.org Date: 2018-10-03 19:11:40 +0000 UTC CryptAuthDeviceManager: correctly migrate old SoftwareFeature prefs. When SoftwareFeatures were changed to be stored in prefs as strings instead of as ints [1], CryptAuthDeviceManagerImpl failed to also migrate old int prefs to the new string representation when the prefs were desrialized. This caused all devices migrating from M69 to a newer release to believe none of their synced devices had any SoftwareFeatures until another sync occurred. 1) https://chromium-review.googlesource.com/c/chromium/src/+/1145588 Bug: 888031 Change-Id: Icd666106d6ebd4db5620cfab0dac459ef67df23e Reviewed-on: https://chromium-review.googlesource.com/1258515 Reviewed-by: Kyle Horimoto <khorimoto@chromium.org> Commit-Queue: Ryan Hansberry <hansberry@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#596263}(cherry picked from commit d7a68aaef311d65588fcda5407c351dcb0d6a2c8) Reviewed-on: https://chromium-review.googlesource.com/c/1259488 Reviewed-by: Ryan Hansberry <hansberry@chromium.org> Cr-Commit-Position: refs/branch-heads/3538@{#840} Cr-Branched-From: 79f7c91a2b2a2932cd447fa6f865cb6662fa8fa6-refs/heads/master@{#587811}
,
Oct 3
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/015c3485f6e7300e19bd3288861f5ea6dc6b7ca1 commit 015c3485f6e7300e19bd3288861f5ea6dc6b7ca1 Author: Ryan Hansberry <hansberry@chromium.org> Date: Wed Oct 03 19:11:40 2018 CryptAuthDeviceManager: correctly migrate old SoftwareFeature prefs. When SoftwareFeatures were changed to be stored in prefs as strings instead of as ints [1], CryptAuthDeviceManagerImpl failed to also migrate old int prefs to the new string representation when the prefs were desrialized. This caused all devices migrating from M69 to a newer release to believe none of their synced devices had any SoftwareFeatures until another sync occurred. 1) https://chromium-review.googlesource.com/c/chromium/src/+/1145588 Bug: 888031 Change-Id: Icd666106d6ebd4db5620cfab0dac459ef67df23e Reviewed-on: https://chromium-review.googlesource.com/1258515 Reviewed-by: Kyle Horimoto <khorimoto@chromium.org> Commit-Queue: Ryan Hansberry <hansberry@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#596263}(cherry picked from commit d7a68aaef311d65588fcda5407c351dcb0d6a2c8) Reviewed-on: https://chromium-review.googlesource.com/c/1259488 Reviewed-by: Ryan Hansberry <hansberry@chromium.org> Cr-Commit-Position: refs/branch-heads/3538@{#840} Cr-Branched-From: 79f7c91a2b2a2932cd447fa6f865cb6662fa8fa6-refs/heads/master@{#587811} [modify] https://crrev.com/015c3485f6e7300e19bd3288861f5ea6dc6b7ca1/components/cryptauth/cryptauth_device_manager_impl.cc [modify] https://crrev.com/015c3485f6e7300e19bd3288861f5ea6dc6b7ca1/components/cryptauth/cryptauth_device_manager_impl_unittest.cc
,
Oct 3
|
|||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||
Comment 1 by mattm@chromium.org
, Sep 21