New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 803677 link

Starred by 1 user

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



Sign in to add a comment

BluetoothRemoteGattCharacteristicBlueZ::WriteRemoteCharacteristic() discards callbacks

Project Member Reported by khorimoto@chromium.org, Jan 18 2018

Issue description

Left a long-running host scan going. Eventually, it stopped working.

Toward the end of the logs, I see things like:

[1390:1390:0118/114324.293210:INFO:ble_connection_manager.cc(229)] Register - Device ID: "CAESR...aLg==", Reason: [KeepAliveTickle]

...but no connection is ever attempted.

Still digging into this. Logs here:
https://drive.google.com/file/d/1ENQT7REBOQOEJwxVYSA2SodZWbYzvuK8/view?usp=sharing
 
Looks like authentication fails at some point, then we never unregister the device whose authentication fails. This causes it to get stuck in the device queue:

[1390:1390:0118/113639.811200:WARNING:device_to_device_authenticator.cc(198)] Authentication failed: Timed out waiting for [Responder Auth]
[1390:1390:0118/113639.811963:WARNING:secure_channel.cc(302)] Failed to authenticate connection to device with ID CAESR...aLg==
[1390:1390:0118/113639.812639:INFO:bluetooth_low_energy_weave_client_connection.cc(231)] Disconnection requested; sending "connection close" uWeave packet to {id: "CAESR...aLg==", addr: "7C:A8:45:6E:4E:59"}.
[1390:1390:0118/113639.813331:INFO:bluetooth_low_energy_weave_client_connection.cc(816)] Sending a "connection close" uWeave packet.
[1390:1390:0118/113639.814027:INFO:bluetooth_low_energy_weave_client_connection.cc(647)] Writing 3 bytes to {id: "CAESR...aLg==", addr: "7C:A8:45:6E:4E:59"}.

Then, because the message that got stuck in the queue was a TetherAvailabilityRequest, the scan was still considered active (since all connection attempts had not yet completed). Thus, when a new scan was triggered, HostScanScheduler thought that a scan was ongoing and never scheduled a new one.
Cc: mcchou@chromium.org r...@chromium.org
Writing the "connection close" packet should eventually trigger a transition to the DISCONNECTED state. However, it does not appear that we ever get either the success or error callback called when writing the packet.

These logs appear just after the packet is written:
[1390:1390:0118/113639.814072:VERBOSE1:bluetooth_remote_gatt_characteristic_bluez.cc(203)] Sending GATT characteristic write request to characteristic: /org/bluez/hci0/dev_7C_A8_45_6E_4E_59/service0028/char002c, UUID: 00000100-0004-1000-8000-001a11000101, with value: [A20000].
[1390:1390:0118/113639.841778:VERBOSE2:bluetooth_gatt_descriptor_client.cc(170)] Remote GATT descriptor removed: /org/bluez/hci0/dev_7C_A8_45_6E_4E_59/service0028/char0029/desc002b
[1390:1390:0118/113639.841826:VERBOSE2:bluetooth_remote_gatt_characteristic_bluez.cc(282)] Unknown descriptor removed: /org/bluez/hci0/dev_7C_A8_45_6E_4E_59/service0028/char0029/desc002b
[1390:1390:0118/113639.841847:VERBOSE1:bluetooth_remote_gatt_characteristic_bluez.cc(286)] Removing remote GATT descriptor from characteristic: /org/bluez/hci0/dev_7C_A8_45_6E_4E_59/service0028/char0029, UUID: 00000100-0004-1000-8000-001a11000102
[1390:1390:0118/113639.841889:VERBOSE2:bluetooth_remote_gatt_characteristic_bluez.cc(282)] Unknown descriptor removed: /org/bluez/hci0/dev_7C_A8_45_6E_4E_59/service0028/char0029/desc002b
[1390:1390:0118/113639.841963:VERBOSE2:bluetooth_gatt_characteristic_client.cc(212)] Remote GATT characteristic removed: /org/bluez/hci0/dev_7C_A8_45_6E_4E_59/service0028/char0029
[1390:1390:0118/113639.841988:VERBOSE2:bluetooth_remote_gatt_service_bluez.cc(205)] Unknown GATT characteristic removed: /org/bluez/hci0/dev_7C_A8_45_6E_4E_59/service0028/char0029
[1390:1390:0118/113639.842008:VERBOSE1:bluetooth_remote_gatt_service_bluez.cc(209)] Removing remote GATT characteristic from service: /org/bluez/hci0/dev_7C_A8_45_6E_4E_59/service0028, UUID: b3b7e28e-a000-3e17-bd86-6e97b9e28c11
[1390:1390:0118/113639.842087:VERBOSE2:bluetooth_gatt_characteristic_client.cc(212)] Remote GATT characteristic removed: /org/bluez/hci0/dev_7C_A8_45_6E_4E_59/service0028/char002c
[1390:1390:0118/113639.842109:VERBOSE2:bluetooth_remote_gatt_service_bluez.cc(205)] Unknown GATT characteristic removed: /org/bluez/hci0/dev_7C_A8_45_6E_4E_59/service0028/char002c
[1390:1390:0118/113639.842128:VERBOSE1:bluetooth_remote_gatt_service_bluez.cc(209)] Removing remote GATT characteristic from service: /org/bluez/hci0/dev_7C_A8_45_6E_4E_59/service0028, UUID: b3b7e28e-a000-3e17-bd86-6e97b9e28c11
[1390:1390:0118/113639.842211:VERBOSE2:bluetooth_gatt_service_client.cc(91)] Remote GATT service removed: /org/bluez/hci0/dev_7C_A8_45_6E_4E_59/service0028

Bluetooth folks, does this mean that the descriptor was removed just as the packet was about to be sent? Does this cause the Bluetooth stack to never actually call the callback/errback?
Owner: josephsih@chromium.org
Status: Assigned (was: Started)
Summary: BluetoothRemoteGattCharacteristicBlueZ::WriteRemoteCharacteristic() discards callbacks (was: BleConnectionManager stops trying to connect)
I talked to Miao offline about this, and she thinks that it is indeed a Bluetooth bug in which BluetoothRemoteGattCharacteristicBlueZ discards the callback and error callback when the descriptor is removed. Because the callbacks are never invoked, Instant Tethering code never learns that the packet was sent and never closes the connection.

Assigning this issue to Joseph to look at things further. Since we now suspect this is a Bluetooth bug, I've included more debug logs here: https://drive.google.com/open?id=1Qxbspo7Q4Z9PZCRpNvSNNg5XYi1uFdMm.
Project Member

Comment 4 by bugdroid1@chromium.org, Jan 22 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/fe22dec1fcd295c8681a6bf6c698c2469ceb3c90

commit fe22dec1fcd295c8681a6bf6c698c2469ceb3c90
Author: Kyle Horimoto <khorimoto@google.com>
Date: Mon Jan 22 23:42:45 2018

[CrOS Tether] Add a timeout for sending a message over BLE.

The Bluetooth APIs accept callback and error callback parameters when
sending a message, but we've recently discovered that there are some
bugs which can prevent these callbacks from being invoked. This CL adds
an extra timeout which ensures that even if the callbacks are not
invoked, BluetoothLowEnergyWeaveClientConnection handles this case
robustly.

This is not the full fix to https://crbug.com/803677, but it is a
partial fix which can avoid some nasty side effects for Instant
Tethering.

Bug: 803677, 672263
Change-Id: I982c08dfe726a3c3e3164aac87ca8bec10b68be7
Reviewed-on: https://chromium-review.googlesource.com/877339
Reviewed-by: Ryan Hansberry <hansberry@chromium.org>
Commit-Queue: Ryan Hansberry <hansberry@chromium.org>
Commit-Queue: Kyle Horimoto <khorimoto@chromium.org>
Cr-Commit-Position: refs/heads/master@{#531062}
[modify] https://crrev.com/fe22dec1fcd295c8681a6bf6c698c2469ceb3c90/components/cryptauth/ble/bluetooth_low_energy_weave_client_connection.cc
[modify] https://crrev.com/fe22dec1fcd295c8681a6bf6c698c2469ceb3c90/components/cryptauth/ble/bluetooth_low_energy_weave_client_connection.h
[modify] https://crrev.com/fe22dec1fcd295c8681a6bf6c698c2469ceb3c90/components/cryptauth/ble/bluetooth_low_energy_weave_client_connection_unittest.cc
[modify] https://crrev.com/fe22dec1fcd295c8681a6bf6c698c2469ceb3c90/tools/metrics/histograms/enums.xml

Labels: Merge-Request-65
Labels: -Merge-Request-65 Merge-Approved-65
Project Member

Comment 7 by bugdroid1@chromium.org, Jan 23 2018

Labels: -merge-approved-65 merge-merged-3325
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/456a4076ac91b56affd8fcd0a9898104bad45ccc

commit 456a4076ac91b56affd8fcd0a9898104bad45ccc
Author: Kyle Horimoto <khorimoto@google.com>
Date: Tue Jan 23 22:32:22 2018

[CrOS Tether] Add a timeout for sending a message over BLE.

The Bluetooth APIs accept callback and error callback parameters when
sending a message, but we've recently discovered that there are some
bugs which can prevent these callbacks from being invoked. This CL adds
an extra timeout which ensures that even if the callbacks are not
invoked, BluetoothLowEnergyWeaveClientConnection handles this case
robustly.

This is not the full fix to https://crbug.com/803677, but it is a
partial fix which can avoid some nasty side effects for Instant
Tethering.

TBR=khorimoto@google.com

(cherry picked from commit fe22dec1fcd295c8681a6bf6c698c2469ceb3c90)

Bug: 803677, 672263
Change-Id: I982c08dfe726a3c3e3164aac87ca8bec10b68be7
Reviewed-on: https://chromium-review.googlesource.com/877339
Reviewed-by: Ryan Hansberry <hansberry@chromium.org>
Commit-Queue: Ryan Hansberry <hansberry@chromium.org>
Commit-Queue: Kyle Horimoto <khorimoto@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#531062}
Reviewed-on: https://chromium-review.googlesource.com/882216
Reviewed-by: Kyle Horimoto <khorimoto@chromium.org>
Cr-Commit-Position: refs/branch-heads/3325@{#39}
Cr-Branched-From: bc084a8b5afa3744a74927344e304c02ae54189f-refs/heads/master@{#530369}
[modify] https://crrev.com/456a4076ac91b56affd8fcd0a9898104bad45ccc/components/cryptauth/ble/bluetooth_low_energy_weave_client_connection.cc
[modify] https://crrev.com/456a4076ac91b56affd8fcd0a9898104bad45ccc/components/cryptauth/ble/bluetooth_low_energy_weave_client_connection.h
[modify] https://crrev.com/456a4076ac91b56affd8fcd0a9898104bad45ccc/components/cryptauth/ble/bluetooth_low_energy_weave_client_connection_unittest.cc
[modify] https://crrev.com/456a4076ac91b56affd8fcd0a9898104bad45ccc/tools/metrics/histograms/enums.xml

Joseph: I've landed a defensive fix for this on my side (and merged it to M-65), but the actual bug in Bluetooth still needs to be fixed. Keeping this bug open for your fix.
Hi Kyle, is it possible that the problem is on the android side? I encountered this issue from time to time. It seems to me that restarting the phone would make the connection work.
Hi Joseph - this should be agnostic to the Android side and should behave correctly either way. Here's what is happening:

(1) Instant Tethering calls BluetoothRemoteGattCharacteristicBlueZ::WriteRemoteCharacteristic() to send a message to the phone. It passes a success and error callback.
(2) The phone disconnects just as this is occurring.
(3) Neither the success or error callbacks is invoked. We get stuck.

The correct thing to do is simply to call the error callback if this occurs instead of just not calling any callback.

Looking at the code, it seems that the callbacks get passed to [1]. Then, if the BluetoothGattCharacteristicClientImpl object is deleted, the callbacks get dropped since its WeakPtrFactory is deleted. I didn't read the class thoroughly, but it seems to me that the correct thing to do would be to invoke any pending error callbacks in the destructor of this class.

[1] https://cs.chromium.org/chromium/src/device/bluetooth/dbus/bluetooth_gatt_characteristic_client.cc?q="void+WriteValue"
Labels: -Pri-1 -M-65 Pri-2
Downgrading priority since I've added a workaround for M-65.

Sign in to add a comment