New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 607384 link

Starred by 4 users

Issue metadata

Status: Verified
Owner:
Last visit > 30 days ago
Closed: Nov 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



Sign in to add a comment

Jerry terminates connection to PocketLab sensor

Project Member Reported by jleong@chromium.org, Apr 28 2016

Issue description

Attached are three test runs where Jerry terminated the Bluetooth connection to the PocketLab sensor. In the three cases, the app had been running between 1-2 hours when the disconnect occurred.

App can be found here: https://chrome.google.com/webstore/detail/pocketlab/hcgfdfgomnfebpnlkohhflcdhkllepdk
 
8284380771-system_logs.zip
269 KB Download
8284783519-system_logs.zip
367 KB Download
8285040063-system_logs.zip
450 KB Download
Hi Jian, the gadget looks cool :-)   However, it is not available in my area. I am wondering if it is possible that you could send one to me for further investigation? Thanks!

Comment 2 by jleong@chromium.org, Apr 29 2016

Cc: smbar...@chromium.org
Thanks, Joseph! The amazing smbarber@ will be heading to TPE next week, and he'll bring one for you!
That sounds great! Thanks Stephen and Jian!
Hi Jian, I first used bluez 5.37 to connect to the PocketLab. It did not disconnect after more than 3 hours.

Then I used the latest Jerry image (veyron_jerry R52-8315.0.0) with bluez 5.39 to connect to the PocketLab. No disconnection was observed either in 5 hours.

I noticed that the disconnection problem occurred in Comment#0 was with bluez 5.35. It is possible that the disconnection problem was fixed in recent versions.  Maybe you could try a newer image version to see if you could reproduce the disconnection problem? Thanks!

Comment 5 by jleong@chromium.org, May 13 2016

Cc: josephsih@chromium.org
Owner: nsale@chromium.org
Thanks, Joseph!

Reassigning to Nigel- could you help coordinate with CTL/Pocketlab on this?

Comment 6 by j...@zode64.com, May 24 2016

Hi Nigel, Joseph. I'm a developer working on the Chrome OS PocketLab app. Thanks for looking into it.

5 days ago I did an update of my Chrome OS version on my Chromebook which is now at 50.0.2661.91. I ran the test 2 times (one for 5 hours and one for 2 hours) before I encountered the issue so it does still exist in this version.

Does this version have a bluez version greater than 5.35?


Hi John, unfortunately, the current stable release 50.0.2661.xx still uses Bluez 5.35. You need to wait for a little while until 51.0.2672.0 or above is pushded.

Comment 8 by j...@zode64.com, May 24 2016

In the process of downloading version 51.0.2704.55 on the beta channel. I will let you know what I find.
Hi, John, when you observed the disconnection, what was the behavior? Did the chromebook try to re-connect to PocketLab automatically? Or Do you need to manually click in the bluetooth tray to reconnect?

Comment 10 by j...@zode64.com, May 24 2016

In the case of the 2 tests I am referring to I wasn't watching the screen but in the past I have observed a brief inconsistency in the frequency of readings followed by a disconnect. 

I never need to manually collect through the system tray. After a disconnect the device just returns to advertising mode and the app then detects it and I can select to connect to it. 

A couple of other observation which may or may not be useful are; there may be a pattern where by the time between beginning a connection and the disconnect decreases according to the number of previous disconnects or time already spent running the app. It also seems more likely to disconnect if the screen is allowed to sleep. However this behavior is by no means consistent and maybe just coincidence.
 
Hi John, have you successfully re-imaged 51.0.2704.55? Do you observe the disconnection issue on the newer image?

Comment 12 by j...@zode64.com, May 26 2016

Hi Joseph, Yes I ran the tests on 51.0.2704.55 and did observe the disconnection issue. 

After about 2 hours the Chromebook had been on low power for a while and when I woke it back up again it had crashed.

FYI I have seen the crash without it going to low power mode but it has happened more times in low power mode than normal mode.  
I am testing a PocketLab with 51.0.2704.55 again.

Hi John, in your test setup, did you leave the PocketLab app screen on the sensors menu page with 8 sensors? Or did you click into one of the sensors, for example temperature sensor?

Comment 14 by j...@zode64.com, May 26 2016

I clicked into the acceleration sensor screen (so I was graphing at the time).

I've never seen a disconnect on the sensor screen but it's not something I've tested for either.
I have been in the sensor screen for 60 minutes and do not observe the disconnection. 

Maybe that's why I did not observe the disconnection. I always tested it on the sensor screen without clicking into any particular sensor. Now let me test it again with clicking into acceleration.

Comment 16 by j...@zode64.com, May 26 2016

Yes, sorry this wasn't clear. You should try leaving it graphing.

Also it usually takes longer than 1 hours before observing the crash.
Hi John, I saw the disconnection in half an hour after clicking into the acceleration sensor. I will look further into the issue and get back to you later.
Hi John, do you ever have a chance to try PocketLab on other chromebook platforms? If so, do you see the same disconnection problem and which chromebook you use?

Comment 19 by j...@zode64.com, May 30 2016

I only have one chromebook which is the Dell Chromebook 11 but I believe that the problem has been replicated on all other devices we have tested on. I will get a list of these devices shortly.
This looks like a Marvell's bluetooth controller issue. Maybe we could have Marvell guys take a look?

From btmon log, PocketLab keeps sending data periodically until the device disconnected with reason 1.

> ACL Data RX: Handle 128 flags 0x02 dlen 13                 [hci0] 1777.195220
      ATT: Handle Value Notification (0x1b) len 8
        Handle: 0x0030
          Data: f4e000300c00
> ACL Data RX: Handle 128 flags 0x02 dlen 13                 [hci0] 1777.259709
      ATT: Handle Value Notification (0x1b) len 8
        Handle: 0x0030
          Data: f4c000200c20
> HCI Event: Disconnect Complete (0x05) plen 4               [hci0] 1777.733315
        Status: Success (0x00)
        Handle: 128
        Reason: Connection Timeout (0x08)
@ Device Disconnected: E0:E5:CF:A8:A2:B9 (1) reason 1


In the /var/log/messages, we could see the disconnection with reason 1 too as follows:

2016-05-30T12:02:16.739710+08:00 DEBUG bluetoothd[4907]: src/adapter.c:dev_disconnected() Device E0:E5:CF:A8:A2:B9 disconnected, reason 1
2016-05-30T12:02:16.740104+08:00 DEBUG bluetoothd[4907]: src/adapter.c:adapter_remove_connection() 
2016-05-30T12:02:16.740348+08:00 DEBUG bluetoothd[4907]: plugins/policy.c:disconnect_cb() reason 1
2016-05-30T12:02:16.740563+08:00 DEBUG bluetoothd[4907]: src/adapter.c:bonding_attempt_complete() hci0 bdaddr E0:E5:CF:A8:A2:B9 type 1 status 0xe
2016-05-30T12:02:16.740743+08:00 DEBUG bluetoothd[4907]: src/device.c:device_bonding_complete() bonding (nil) status 0x0e
2016-05-30T12:02:16.740951+08:00 DEBUG bluetoothd[4907]: src/device.c:device_bonding_failed() status 14
2016-05-30T12:02:16.741797+08:00 DEBUG bluetoothd[4907]: src/adapter.c:resume_discovery() 
2016-05-30T12:02:16.744375+08:00 DEBUG bluetoothd[4907]: src/device.c:att_disconnected_cb() 
2016-05-30T12:02:16.745328+08:00 DEBUG bluetoothd[4907]: src/device.c:att_disconnected_cb() Connection timed out (110)
2016-05-30T12:02:16.746188+08:00 DEBUG bluetoothd[4907]: src/gatt-client.c:btd_gatt_client_disconnected() Device disconnected. Cleaning up.
2016-05-30T12:02:16.747190+08:00 DEBUG bluetoothd[4907]: src/device.c:att_disconnected_cb() Automatic connection disabled
2016-05-30T12:02:16.751590+08:00 DEBUG bluetoothd[4907]: attrib/gattrib.c:g_attrib_unref() 0xb8d05a00: g_attrib_unref=0


The complete logs are attached. Please search the two logs with "### PocketLab was disconnected here." 

Generally, the disconnection could be observed within two hours. In my experiments, I consistently saw the disconnection in half an hour after establishing a connection from PocketLab web app to the PocketLab device, and then clicking into the "Acceleration" graph.
btmon_pocketlab_disconnection_0530b.log
6.3 MB View Download
messages_pocketlab_disconnection_0530b.log
986 KB View Download
Cc: akar...@marvell.com
Amitkumar, could you help take a look please?

Comment 22 by j...@zode64.com, May 31 2016

Information on device testing.

I think we've encountered a hang crash while using all of the devices below, but I'm not 100% sure.

Dell Chromebook 11
Google Pixel
CTL NL6
CTL J2

The Dell and the CTL NL6 have much better bluetooth connection stability than the other two computers.

I've often done testing with a stable BLE connection to the Dell and NL6 running for >2 hours.

The J2 and Pixel typically drop connection within ~30 min of testing.
Connection timeout happens due to no traffic for over 420ms which is the supervision timeout.
______________________________________________________
> ACL Data RX: Handle 128 flags 0x02 dlen 13                 [hci0] 1777.193482
      ATT: Handle Value Notification (0x1b) len 8
        Handle: 0x0030
          Data: f4e000800c40
> ACL Data RX: Handle 128 flags 0x02 dlen 13                 [hci0] 1777.195220
      ATT: Handle Value Notification (0x1b) len 8
        Handle: 0x0030
          Data: f4e000300c00
> ACL Data RX: Handle 128 flags 0x02 dlen 13                 [hci0] 1777.259709
      ATT: Handle Value Notification (0x1b) len 8
        Handle: 0x0030
          Data: f4c000200c20

### PocketLab was disconnected here.

> HCI Event: Disconnect Complete (0x05) plen 4               [hci0] 1777.733315
        Status: Success (0x00)
        Handle: 128
        Reason: Connection Timeout (0x08)
@ Device Disconnected: E0:E5:CF:A8:A2:B9 (1) reason 1
______________________________________________________

Time difference between disconnection and the previous Notification Packet was 473.606 ms.
Supervision Timeout for the link is 420 ms, as seen below:


______________________________________________________
< HCI Command: LE Create Connection (0x08|0x000d) plen 25      [hci0] 33.414825
        Scan interval: 60.000 msec (0x0060)
        Scan window: 60.000 msec (0x0060)
        Filter policy: White list is not used (0x00)
        Peer address type: Public (0x00)
        Peer address: E0:E5:CF:A8:A2:B9 (OUI E0-E5-CF)
        Own address type: Public (0x00)
        Min connection interval: 50.00 msec (0x0028)
        Max connection interval: 70.00 msec (0x0038)
        Connection latency: 0x0000
        Supervision timeout: 420 msec (0x002a)
        Min connection length: 0.000 msec (0x0000)
        Max connection length: 0.000 msec (0x0000)
______________________________________________________

I'd like to know more about the expected packet exchange - will there be continuous traffic, viz. the ATT Handle Notification? Will there be idle periods wherein only Nulls and Polls shall be exchanged? What would be the duration of these idle periods?

Since Nulls and Polls will not show up on the HCI logs, I need Sniffer logs to confirm that no packets are exchanged, from either Chromebook to PocketLab or vice-versa, for over 420ms.


Furthermore, from the messages.txt:
______________________________________________________
2016-05-30T11:20:52.609927+08:00 DEBUG bluetoothd[4907]: src/adapter.c:dev_disconnected() Device E0:E5:CF:A8:A2:B9 disconnected, reason 1
2016-05-30T11:20:52.609965+08:00 DEBUG bluetoothd[4907]: src/adapter.c:adapter_remove_connection() 
2016-05-30T11:20:52.609979+08:00 DEBUG bluetoothd[4907]: plugins/policy.c:disconnect_cb() reason 1
2016-05-30T11:20:52.609991+08:00 DEBUG bluetoothd[4907]: src/adapter.c:bonding_attempt_complete() hci0 bdaddr E0:E5:CF:A8:A2:B9 type 1 status 0xe
2016-05-30T11:20:52.610001+08:00 DEBUG bluetoothd[4907]: src/device.c:device_bonding_complete() bonding (nil) status 0x0e
______________________________________________________
The above instance of disconnection is much earlier than the one you have pointed out.
May I know more about this disconnection? Was this from a previous test?

Additionally, the subsequent Bonding attempt fails with 0x3E (Conn Rejected due to Security Reasons).
______________________________________________________

2016-05-30T11:20:52.609927+08:00 DEBUG bluetoothd[4907]: src/adapter.c:dev_disconnected() Device E0:E5:CF:A8:A2:B9 disconnected, reason 1
2016-05-30T11:20:52.609965+08:00 DEBUG bluetoothd[4907]: src/adapter.c:adapter_remove_connection() 
2016-05-30T11:20:52.609979+08:00 DEBUG bluetoothd[4907]: plugins/policy.c:disconnect_cb() reason 1
2016-05-30T11:20:52.609991+08:00 DEBUG bluetoothd[4907]: src/adapter.c:bonding_attempt_complete() hci0 bdaddr E0:E5:CF:A8:A2:B9 type 1 status 0xe
2016-05-30T11:20:52.610001+08:00 DEBUG bluetoothd[4907]: src/device.c:device_bonding_complete() bonding (nil) status 0x0e
2016-05-30T11:20:52.610013+08:00 DEBUG bluetoothd[4907]: src/device.c:device_bonding_failed() status 14
*******************************************************
2016-05-30T12:02:16.739710+08:00 DEBUG bluetoothd[4907]: src/adapter.c:dev_disconnected() Device E0:E5:CF:A8:A2:B9 disconnected, reason 1
2016-05-30T12:02:16.740104+08:00 DEBUG bluetoothd[4907]: src/adapter.c:adapter_remove_connection() 
2016-05-30T12:02:16.740348+08:00 DEBUG bluetoothd[4907]: plugins/policy.c:disconnect_cb() reason 1
2016-05-30T12:02:16.740563+08:00 DEBUG bluetoothd[4907]: src/adapter.c:bonding_attempt_complete() hci0 bdaddr E0:E5:CF:A8:A2:B9 type 1 status 0xe
2016-05-30T12:02:16.740743+08:00 DEBUG bluetoothd[4907]: src/device.c:device_bonding_complete() bonding (nil) status 0x0e
2016-05-30T12:02:16.740951+08:00 DEBUG bluetoothd[4907]: src/device.c:device_bonding_failed() status 14

______________________________________________________


Is the PocketLab device able to bond after the disconnection? Are all attempts failures? I only see two bonding attempts from the logs.
What is the method to recover from the issue state?

Comment 24 by j...@zode64.com, Jun 1 2016

I'll answer this to the best of my ability as the app developer and ask the guy working on the actual PocketLab device to have a look as well.

I have only seen this issue in the app during graphing. At this point it is receiving 10 notifications per second and there are no idle periods in theory.

On disconnect the app returns to a connection screen and waits for user input to connect to the device again. We have been considering implementing an automatic reconnect but it's not ideal and we wanted to see if we could get the underlying issue fixed.


HI rprashk, thank you for your detailed analysis!

Hi John, from rprashk's analysis, it is possible that the disconnection is caused by supervision timeout. Although 10 notifications per second are sent, there might be some deviations or interference. The controller would disconnect if a packet is not received within 420 ms which is the default supervision timeout of bluez. For other platforms, the supervision timeout may be different. For example, it seems that Mac OS set the supervision timeout to 2 seconds. In that case, the disconnection would be reduced significantly. 

Is it possible that PocketLab could send "Link Supervision Timeout Change" event to increase the supervision timeout, say to 2 seconds too, and see if that improves the connection stability? 

Comment 26 by j...@zode64.com, Jun 1 2016

Hi Joseph, I will get the PocketLab device guy to comment about what is possible on that side.

Is this value not configurable on the system (ChromeOS) side?

Thanks for the work done up to now.



Hi John, I quickly traced the bluez stack. It seemed to me that the parameter is for system wide. As other peripherals would use the value when connected to Chromebooks, we are not likely to simply change the value from our side. Hence, I think it makes sense for a peripheral to send the change request when creating a connection.


RPrashk and Joseph,

Thanks for the support. To answer some of your PocketLab BLE questions



I'd like to know more about the expected packet exchange - will there be continuous traffic, viz. the ATT Handle Notification? 

For all the PocketLab - Chromebook testing we connect the devices and monitor the acceleration graph for the duration of the test. One packet of sensor data is transmitted during each connection event. There is continuous traffic using the ATT Handle notification like you said and this occurs about every 66 ms. There are no idle periods when monitoring the acceleration graph. I have used a BLE packet sniffer to capture the communication between the PocketLab and Chromebook to verify.

I have attached a screenshot of the Packet Sniffer log and the full log from a disconnect event between the J2 Chromebook and the PocketLab. I used the TI SmartRF Packet Sniffer to capture the data.



Since Nulls and Polls will not show up on the HCI logs, I need Sniffer logs to confirm that no packets are exchanged, from either Chromebook to PocketLab or vice-versa, for over 420ms.

There should be no period where packets are not exchanged for longer than the connection interval set by the Chromebook. And when we typically see a disconnect event, we are sending data during every connection.



Is the PocketLab device able to bond after the disconnection? Are all attempts failures?

The PocketLab will go into advertising mode when a disconnect occurs, but we typically cannot re-connect to the Chromebook. 




What is the method to recover from the issue state?

We close the PocketLab app, re-launch, put the PocketLab sensor into advertising mode and attempt to re-connect. On re-connect we often do not see advertising devices. So then we shutdown the Chromebook and re-launch to fix.



disconnect J2 051716-0938.png
213 KB View Download
disconnect J2 051716-0938.psd
8.9 MB Download
Hi clifton, thank you for capturing the sniffer logs and asnwering the questions! I am not able to open the .psd file using GIMP. Is it a photoshop document file? Do you know what software is needed to open it? Or are you able to convert it to some other common types?

Hi rprashk, from the sniffer PNG figure, it looks that the sensor always responds in about 230 us in every connection interval. When the controller did not poll the sensor for an unusually long time (more than supervision timeout), the disconnection occurred. Could you take a look at the controller side? Thanks!


Joseph,

The .psd file is a Packet Sniffer Data file. Sorry, I can't export as any other file type. You can open it with the TI SmartRF Packet Sniffer Program.


Hi akarwar and rprashk, regarding Comment #29, have you found any possible causes? Thanks.
Hi Clifton and John, a patch has just landed to enlarge the supervision timeout value to 2 seconds. This should improve the connection stability. You would see the change in M58. Please conduct a test later to see if you still observe the disconnection issue in M58.
Clifton and John, would you comment on if this issue is resolved?
Status: Fixed (was: Assigned)
Please verify if the fix worked or if there are still issues.

Comment 35 by dchan@chromium.org, Jan 22 2018

Status: Archived (was: Fixed)

Comment 36 by dchan@chromium.org, Jan 23 2018

Status: Fixed (was: Archived)
Status: Verified (was: Fixed)

Sign in to add a comment