Out-of-sync of discovery session in Chrome/Instant Tether |
|||||||||||
Issue descriptionFrom a specific case in b/78181184, a bluetoothd crash may cause out-of-sync of discovery session in Chrome/Instant Tether, apparent from the logs mentioning that ble_scanner_impl.cc keeps trying to stop a discovery session but that discovery session is not active. At this moment it's not yet clear whether the out-of-sync is in Chrome Bluetooth API or in Instant Tether.
,
May 22 2018
,
May 22 2018
,
May 24 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/acd69e7c2897409ecc468f4a8cd5a6ec2661dfc9 commit acd69e7c2897409ecc468f4a8cd5a6ec2661dfc9 Author: Sonny Sasaka <sonnysasaka@chromium.org> Date: Thu May 24 01:06:36 2018 device/bluetooth: Make sure to clear discovery sessions when adapter goes away When Bluetooth adapter goes away, this could be caused by BlueZ crash or chip error. In any case, we should make sure to clear all discovery sessions, otherwise out-of-sync about discovery sessions can happen between Chrome and BlueZ. BUG= 837656 Change-Id: Id1956039a94da32fe95a0d2c023eb97eef8d99fa Reviewed-on: https://chromium-review.googlesource.com/1069934 Commit-Queue: Sonny Sasaka <sonnysasaka@chromium.org> Reviewed-by: Miao-chen Chou <mcchou@chromium.org> Cr-Commit-Position: refs/heads/master@{#561347} [modify] https://crrev.com/acd69e7c2897409ecc468f4a8cd5a6ec2661dfc9/device/bluetooth/bluez/bluetooth_adapter_bluez.cc [modify] https://crrev.com/acd69e7c2897409ecc468f4a8cd5a6ec2661dfc9/device/bluetooth/bluez/bluetooth_bluez_unittest.cc
,
May 24 2018
Sonny, should this be merged to M-67?
,
May 24 2018
Yes, filed a merge request.
,
May 24 2018
This bug requires manual review: We are only 4 days from stable. Please contact the milestone owner if you have questions. Owners: cmasso@(Android), cmasso@(iOS), kbleicher@(ChromeOS), govind@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
May 24 2018
Pending update from Sonny re: testing
,
May 24 2018
Testing result: Kyle did a stress test involving continuous scanning for hours. The tested subject is eve M67-10575.40.0 (which still had a lot of bluetoothd crashes) with the fix patch applied. This was intentionally chosen to see whether the fix is resilient of these crashes. During about 3 hours of continuous scan, there were 10 crashes of bluetoothd but none of them led to the out-of-sync state this bug is about. We stopped the test when it hit other issues outside the scope of this bug. logs from Kyle's testing: https://drive.google.com/file/d/1G6Fi7-ZkU5k0RZoa5WoWREcUkJ2aZkU7/view So, from the testing it can be seen that the fix successfully mitigate the out-of-sync discovery session bug.
,
May 25 2018
Thanks. Also, I didn't note where this bug derived. Is this a M67 regression or has it existed longer than that? Also, if we have all of the other crashes mentioned, will this fix make a notable difference for the users?
,
May 25 2018
It has existed since August 2017, but the bug never manifested because other factors didn't quite permit it to happen likely. The crash bugs of bluetoothd (Bluetooth daemon) also have been there for long, but manifested only recently possibly because changes of other factors (like Bluetooth-heavy features). There is another long-term ongoing work to fix bluetoothd crashes. This fix will make our Bluetooth stack to at least recover from those crashes, not get stuck in a bad state. Without the fix, the user notices that the Bluetooth stack is stuck in a bad state (cannot scan anymore without restarting the stack).
,
May 25 2018
Sorry, to clarify my last comment: The bug has existed since August 2017 (because that's the first time I saw the code), but could be much earlier according to the code history.
,
May 29 2018
Stable is pending so I'd rather reject in favor of M68, esp as this has been an issue for some time. Please retag the request with reasoning if there's reason otherwise. Thanks
,
May 29 2018
Hey Kevin - I still think this is worthy of merging to M-67. You are correct that the issue has been in existence for some time, but it seems that some other factors have changed in M-67 which have exposed the issue. We are still investigating *why* the rate of bluetoothd crashes has increased so rapidly in M-67, but we are definitely sure that there *has* been an increase. Please see the following metrics which indicate the severity of the issue: https://uma.googleplex.com/timeline_v2?sid=4fdca47899002f26feabdf192efe5fb3 Generally, this metric hovers around 96-99% success rate. In M-67, the success rate has plummeted to 25-30%. This means that any Chrome OS feature which utilizes Bluetooth Low Energy scans will have a ~75% chance of failing; worse yet, these issues are not recoverable and thus require a full power cycle of the machine before they are fixed. Since most users simply close their laptop lid and reopen it later (without a full shutdown-then-startup flow), this bug will leave the majority of users unable to use BLE features, such as Instant Tethering and EasyUnlock. We are currently running tests on the latest M-67 cut to confirm that the issue still exists, and we will provide test logs as soon as possible. Is there anything specifically you'd like us to do to help with your decision? Thanks!
,
May 29 2018
This bug requires manual review: Request affecting a post-stable build Please contact the milestone owner if you have questions. Owners: cmasso@(Android), cmasso@(iOS), kbleicher@(ChromeOS), govind@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
May 30 2018
Thanks for the context in #14. Assume everyone is satisfied with testing showing that this hasn't made matters worse? I'll approve if that's the case.
,
May 30 2018
Hi Kevin, we understand that it's risky to merge so we did more investigation about the severity of this issue. Taking into consideration the other fixes that has landed and improved the stability (CL 1063049, CL 1063050, CL 1063051), we see that it's not urgent to merge this particular fix to M67 so we accept the merge reject decision. Thanks for reviewing this merge request!
,
May 30 2018
Thanks again, Kevin. Sorry for the noise here :)
,
Jun 4 2018
No worries; thanks for the feedback.
,
Sep 24
|
|||||||||||
►
Sign in to add a comment |
|||||||||||
Comment 1 by sonnysasaka@chromium.org
, May 22 2018