Jerry terminates connection to PocketLab sensor |
||||||||
Issue descriptionAttached are three test runs where Jerry terminated the Bluetooth connection to the PocketLab sensor. In the three cases, the app had been running between 1-2 hours when the disconnect occurred. App can be found here: https://chrome.google.com/webstore/detail/pocketlab/hcgfdfgomnfebpnlkohhflcdhkllepdk
,
Apr 29 2016
Thanks, Joseph! The amazing smbarber@ will be heading to TPE next week, and he'll bring one for you!
,
Apr 30 2016
That sounds great! Thanks Stephen and Jian!
,
May 13 2016
Hi Jian, I first used bluez 5.37 to connect to the PocketLab. It did not disconnect after more than 3 hours. Then I used the latest Jerry image (veyron_jerry R52-8315.0.0) with bluez 5.39 to connect to the PocketLab. No disconnection was observed either in 5 hours. I noticed that the disconnection problem occurred in Comment#0 was with bluez 5.35. It is possible that the disconnection problem was fixed in recent versions. Maybe you could try a newer image version to see if you could reproduce the disconnection problem? Thanks!
,
May 13 2016
Thanks, Joseph! Reassigning to Nigel- could you help coordinate with CTL/Pocketlab on this?
,
May 24 2016
Hi Nigel, Joseph. I'm a developer working on the Chrome OS PocketLab app. Thanks for looking into it. 5 days ago I did an update of my Chrome OS version on my Chromebook which is now at 50.0.2661.91. I ran the test 2 times (one for 5 hours and one for 2 hours) before I encountered the issue so it does still exist in this version. Does this version have a bluez version greater than 5.35?
,
May 24 2016
Hi John, unfortunately, the current stable release 50.0.2661.xx still uses Bluez 5.35. You need to wait for a little while until 51.0.2672.0 or above is pushded.
,
May 24 2016
In the process of downloading version 51.0.2704.55 on the beta channel. I will let you know what I find.
,
May 24 2016
Hi, John, when you observed the disconnection, what was the behavior? Did the chromebook try to re-connect to PocketLab automatically? Or Do you need to manually click in the bluetooth tray to reconnect?
,
May 24 2016
In the case of the 2 tests I am referring to I wasn't watching the screen but in the past I have observed a brief inconsistency in the frequency of readings followed by a disconnect. I never need to manually collect through the system tray. After a disconnect the device just returns to advertising mode and the app then detects it and I can select to connect to it. A couple of other observation which may or may not be useful are; there may be a pattern where by the time between beginning a connection and the disconnect decreases according to the number of previous disconnects or time already spent running the app. It also seems more likely to disconnect if the screen is allowed to sleep. However this behavior is by no means consistent and maybe just coincidence.
,
May 26 2016
Hi John, have you successfully re-imaged 51.0.2704.55? Do you observe the disconnection issue on the newer image?
,
May 26 2016
Hi Joseph, Yes I ran the tests on 51.0.2704.55 and did observe the disconnection issue. After about 2 hours the Chromebook had been on low power for a while and when I woke it back up again it had crashed. FYI I have seen the crash without it going to low power mode but it has happened more times in low power mode than normal mode.
,
May 26 2016
I am testing a PocketLab with 51.0.2704.55 again. Hi John, in your test setup, did you leave the PocketLab app screen on the sensors menu page with 8 sensors? Or did you click into one of the sensors, for example temperature sensor?
,
May 26 2016
I clicked into the acceleration sensor screen (so I was graphing at the time). I've never seen a disconnect on the sensor screen but it's not something I've tested for either.
,
May 26 2016
I have been in the sensor screen for 60 minutes and do not observe the disconnection. Maybe that's why I did not observe the disconnection. I always tested it on the sensor screen without clicking into any particular sensor. Now let me test it again with clicking into acceleration.
,
May 26 2016
Yes, sorry this wasn't clear. You should try leaving it graphing. Also it usually takes longer than 1 hours before observing the crash.
,
May 27 2016
Hi John, I saw the disconnection in half an hour after clicking into the acceleration sensor. I will look further into the issue and get back to you later.
,
May 30 2016
Hi John, do you ever have a chance to try PocketLab on other chromebook platforms? If so, do you see the same disconnection problem and which chromebook you use?
,
May 30 2016
I only have one chromebook which is the Dell Chromebook 11 but I believe that the problem has been replicated on all other devices we have tested on. I will get a list of these devices shortly.
,
May 30 2016
This looks like a Marvell's bluetooth controller issue. Maybe we could have Marvell guys take a look?
From btmon log, PocketLab keeps sending data periodically until the device disconnected with reason 1.
> ACL Data RX: Handle 128 flags 0x02 dlen 13 [hci0] 1777.195220
ATT: Handle Value Notification (0x1b) len 8
Handle: 0x0030
Data: f4e000300c00
> ACL Data RX: Handle 128 flags 0x02 dlen 13 [hci0] 1777.259709
ATT: Handle Value Notification (0x1b) len 8
Handle: 0x0030
Data: f4c000200c20
> HCI Event: Disconnect Complete (0x05) plen 4 [hci0] 1777.733315
Status: Success (0x00)
Handle: 128
Reason: Connection Timeout (0x08)
@ Device Disconnected: E0:E5:CF:A8:A2:B9 (1) reason 1
In the /var/log/messages, we could see the disconnection with reason 1 too as follows:
2016-05-30T12:02:16.739710+08:00 DEBUG bluetoothd[4907]: src/adapter.c:dev_disconnected() Device E0:E5:CF:A8:A2:B9 disconnected, reason 1
2016-05-30T12:02:16.740104+08:00 DEBUG bluetoothd[4907]: src/adapter.c:adapter_remove_connection()
2016-05-30T12:02:16.740348+08:00 DEBUG bluetoothd[4907]: plugins/policy.c:disconnect_cb() reason 1
2016-05-30T12:02:16.740563+08:00 DEBUG bluetoothd[4907]: src/adapter.c:bonding_attempt_complete() hci0 bdaddr E0:E5:CF:A8:A2:B9 type 1 status 0xe
2016-05-30T12:02:16.740743+08:00 DEBUG bluetoothd[4907]: src/device.c:device_bonding_complete() bonding (nil) status 0x0e
2016-05-30T12:02:16.740951+08:00 DEBUG bluetoothd[4907]: src/device.c:device_bonding_failed() status 14
2016-05-30T12:02:16.741797+08:00 DEBUG bluetoothd[4907]: src/adapter.c:resume_discovery()
2016-05-30T12:02:16.744375+08:00 DEBUG bluetoothd[4907]: src/device.c:att_disconnected_cb()
2016-05-30T12:02:16.745328+08:00 DEBUG bluetoothd[4907]: src/device.c:att_disconnected_cb() Connection timed out (110)
2016-05-30T12:02:16.746188+08:00 DEBUG bluetoothd[4907]: src/gatt-client.c:btd_gatt_client_disconnected() Device disconnected. Cleaning up.
2016-05-30T12:02:16.747190+08:00 DEBUG bluetoothd[4907]: src/device.c:att_disconnected_cb() Automatic connection disabled
2016-05-30T12:02:16.751590+08:00 DEBUG bluetoothd[4907]: attrib/gattrib.c:g_attrib_unref() 0xb8d05a00: g_attrib_unref=0
The complete logs are attached. Please search the two logs with "### PocketLab was disconnected here."
Generally, the disconnection could be observed within two hours. In my experiments, I consistently saw the disconnection in half an hour after establishing a connection from PocketLab web app to the PocketLab device, and then clicking into the "Acceleration" graph.
,
May 30 2016
Amitkumar, could you help take a look please?
,
May 31 2016
Information on device testing. I think we've encountered a hang crash while using all of the devices below, but I'm not 100% sure. Dell Chromebook 11 Google Pixel CTL NL6 CTL J2 The Dell and the CTL NL6 have much better bluetooth connection stability than the other two computers. I've often done testing with a stable BLE connection to the Dell and NL6 running for >2 hours. The J2 and Pixel typically drop connection within ~30 min of testing.
,
Jun 1 2016
Connection timeout happens due to no traffic for over 420ms which is the supervision timeout.
______________________________________________________
> ACL Data RX: Handle 128 flags 0x02 dlen 13 [hci0] 1777.193482
ATT: Handle Value Notification (0x1b) len 8
Handle: 0x0030
Data: f4e000800c40
> ACL Data RX: Handle 128 flags 0x02 dlen 13 [hci0] 1777.195220
ATT: Handle Value Notification (0x1b) len 8
Handle: 0x0030
Data: f4e000300c00
> ACL Data RX: Handle 128 flags 0x02 dlen 13 [hci0] 1777.259709
ATT: Handle Value Notification (0x1b) len 8
Handle: 0x0030
Data: f4c000200c20
### PocketLab was disconnected here.
> HCI Event: Disconnect Complete (0x05) plen 4 [hci0] 1777.733315
Status: Success (0x00)
Handle: 128
Reason: Connection Timeout (0x08)
@ Device Disconnected: E0:E5:CF:A8:A2:B9 (1) reason 1
______________________________________________________
Time difference between disconnection and the previous Notification Packet was 473.606 ms.
Supervision Timeout for the link is 420 ms, as seen below:
______________________________________________________
< HCI Command: LE Create Connection (0x08|0x000d) plen 25 [hci0] 33.414825
Scan interval: 60.000 msec (0x0060)
Scan window: 60.000 msec (0x0060)
Filter policy: White list is not used (0x00)
Peer address type: Public (0x00)
Peer address: E0:E5:CF:A8:A2:B9 (OUI E0-E5-CF)
Own address type: Public (0x00)
Min connection interval: 50.00 msec (0x0028)
Max connection interval: 70.00 msec (0x0038)
Connection latency: 0x0000
Supervision timeout: 420 msec (0x002a)
Min connection length: 0.000 msec (0x0000)
Max connection length: 0.000 msec (0x0000)
______________________________________________________
I'd like to know more about the expected packet exchange - will there be continuous traffic, viz. the ATT Handle Notification? Will there be idle periods wherein only Nulls and Polls shall be exchanged? What would be the duration of these idle periods?
Since Nulls and Polls will not show up on the HCI logs, I need Sniffer logs to confirm that no packets are exchanged, from either Chromebook to PocketLab or vice-versa, for over 420ms.
Furthermore, from the messages.txt:
______________________________________________________
2016-05-30T11:20:52.609927+08:00 DEBUG bluetoothd[4907]: src/adapter.c:dev_disconnected() Device E0:E5:CF:A8:A2:B9 disconnected, reason 1
2016-05-30T11:20:52.609965+08:00 DEBUG bluetoothd[4907]: src/adapter.c:adapter_remove_connection()
2016-05-30T11:20:52.609979+08:00 DEBUG bluetoothd[4907]: plugins/policy.c:disconnect_cb() reason 1
2016-05-30T11:20:52.609991+08:00 DEBUG bluetoothd[4907]: src/adapter.c:bonding_attempt_complete() hci0 bdaddr E0:E5:CF:A8:A2:B9 type 1 status 0xe
2016-05-30T11:20:52.610001+08:00 DEBUG bluetoothd[4907]: src/device.c:device_bonding_complete() bonding (nil) status 0x0e
______________________________________________________
The above instance of disconnection is much earlier than the one you have pointed out.
May I know more about this disconnection? Was this from a previous test?
Additionally, the subsequent Bonding attempt fails with 0x3E (Conn Rejected due to Security Reasons).
______________________________________________________
2016-05-30T11:20:52.609927+08:00 DEBUG bluetoothd[4907]: src/adapter.c:dev_disconnected() Device E0:E5:CF:A8:A2:B9 disconnected, reason 1
2016-05-30T11:20:52.609965+08:00 DEBUG bluetoothd[4907]: src/adapter.c:adapter_remove_connection()
2016-05-30T11:20:52.609979+08:00 DEBUG bluetoothd[4907]: plugins/policy.c:disconnect_cb() reason 1
2016-05-30T11:20:52.609991+08:00 DEBUG bluetoothd[4907]: src/adapter.c:bonding_attempt_complete() hci0 bdaddr E0:E5:CF:A8:A2:B9 type 1 status 0xe
2016-05-30T11:20:52.610001+08:00 DEBUG bluetoothd[4907]: src/device.c:device_bonding_complete() bonding (nil) status 0x0e
2016-05-30T11:20:52.610013+08:00 DEBUG bluetoothd[4907]: src/device.c:device_bonding_failed() status 14
*******************************************************
2016-05-30T12:02:16.739710+08:00 DEBUG bluetoothd[4907]: src/adapter.c:dev_disconnected() Device E0:E5:CF:A8:A2:B9 disconnected, reason 1
2016-05-30T12:02:16.740104+08:00 DEBUG bluetoothd[4907]: src/adapter.c:adapter_remove_connection()
2016-05-30T12:02:16.740348+08:00 DEBUG bluetoothd[4907]: plugins/policy.c:disconnect_cb() reason 1
2016-05-30T12:02:16.740563+08:00 DEBUG bluetoothd[4907]: src/adapter.c:bonding_attempt_complete() hci0 bdaddr E0:E5:CF:A8:A2:B9 type 1 status 0xe
2016-05-30T12:02:16.740743+08:00 DEBUG bluetoothd[4907]: src/device.c:device_bonding_complete() bonding (nil) status 0x0e
2016-05-30T12:02:16.740951+08:00 DEBUG bluetoothd[4907]: src/device.c:device_bonding_failed() status 14
______________________________________________________
Is the PocketLab device able to bond after the disconnection? Are all attempts failures? I only see two bonding attempts from the logs.
What is the method to recover from the issue state?
,
Jun 1 2016
I'll answer this to the best of my ability as the app developer and ask the guy working on the actual PocketLab device to have a look as well. I have only seen this issue in the app during graphing. At this point it is receiving 10 notifications per second and there are no idle periods in theory. On disconnect the app returns to a connection screen and waits for user input to connect to the device again. We have been considering implementing an automatic reconnect but it's not ideal and we wanted to see if we could get the underlying issue fixed.
,
Jun 1 2016
HI rprashk, thank you for your detailed analysis! Hi John, from rprashk's analysis, it is possible that the disconnection is caused by supervision timeout. Although 10 notifications per second are sent, there might be some deviations or interference. The controller would disconnect if a packet is not received within 420 ms which is the default supervision timeout of bluez. For other platforms, the supervision timeout may be different. For example, it seems that Mac OS set the supervision timeout to 2 seconds. In that case, the disconnection would be reduced significantly. Is it possible that PocketLab could send "Link Supervision Timeout Change" event to increase the supervision timeout, say to 2 seconds too, and see if that improves the connection stability?
,
Jun 1 2016
Hi Joseph, I will get the PocketLab device guy to comment about what is possible on that side. Is this value not configurable on the system (ChromeOS) side? Thanks for the work done up to now.
,
Jun 1 2016
Hi John, I quickly traced the bluez stack. It seemed to me that the parameter is for system wide. As other peripherals would use the value when connected to Chromebooks, we are not likely to simply change the value from our side. Hence, I think it makes sense for a peripheral to send the change request when creating a connection.
,
Jun 6 2016
RPrashk and Joseph, Thanks for the support. To answer some of your PocketLab BLE questions I'd like to know more about the expected packet exchange - will there be continuous traffic, viz. the ATT Handle Notification? For all the PocketLab - Chromebook testing we connect the devices and monitor the acceleration graph for the duration of the test. One packet of sensor data is transmitted during each connection event. There is continuous traffic using the ATT Handle notification like you said and this occurs about every 66 ms. There are no idle periods when monitoring the acceleration graph. I have used a BLE packet sniffer to capture the communication between the PocketLab and Chromebook to verify. I have attached a screenshot of the Packet Sniffer log and the full log from a disconnect event between the J2 Chromebook and the PocketLab. I used the TI SmartRF Packet Sniffer to capture the data. Since Nulls and Polls will not show up on the HCI logs, I need Sniffer logs to confirm that no packets are exchanged, from either Chromebook to PocketLab or vice-versa, for over 420ms. There should be no period where packets are not exchanged for longer than the connection interval set by the Chromebook. And when we typically see a disconnect event, we are sending data during every connection. Is the PocketLab device able to bond after the disconnection? Are all attempts failures? The PocketLab will go into advertising mode when a disconnect occurs, but we typically cannot re-connect to the Chromebook. What is the method to recover from the issue state? We close the PocketLab app, re-launch, put the PocketLab sensor into advertising mode and attempt to re-connect. On re-connect we often do not see advertising devices. So then we shutdown the Chromebook and re-launch to fix.
,
Jun 14 2016
Hi clifton, thank you for capturing the sniffer logs and asnwering the questions! I am not able to open the .psd file using GIMP. Is it a photoshop document file? Do you know what software is needed to open it? Or are you able to convert it to some other common types? Hi rprashk, from the sniffer PNG figure, it looks that the sensor always responds in about 230 us in every connection interval. When the controller did not poll the sensor for an unusually long time (more than supervision timeout), the disconnection occurred. Could you take a look at the controller side? Thanks!
,
Jun 20 2016
Joseph, The .psd file is a Packet Sniffer Data file. Sorry, I can't export as any other file type. You can open it with the TI SmartRF Packet Sniffer Program.
,
Dec 6 2016
Hi akarwar and rprashk, regarding Comment #29, have you found any possible causes? Thanks.
,
Feb 25 2017
Hi Clifton and John, a patch has just landed to enlarge the supervision timeout value to 2 seconds. This should improve the connection stability. You would see the change in M58. Please conduct a test later to see if you still observe the disconnection issue in M58.
,
Oct 27 2017
Clifton and John, would you comment on if this issue is resolved?
,
Nov 20 2017
Please verify if the fix worked or if there are still issues.
,
Jan 22 2018
,
Jan 23 2018
,
Sep 24
|
||||||||
►
Sign in to add a comment |
||||||||
Comment 1 by josephsih@chromium.org
, Apr 28 2016