New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 833389 link

Starred by 1 user

Issue metadata

Status: Verified
Owner:
Closed: Apr 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



Sign in to add a comment

bluetoothd crash in prim_attr_cmp()

Project Member Reported by josephsih@chromium.org, Apr 16 2018

Issue description

OS: CHROMEOS_RELEASE_BUILDER_PATH=eve-release/R66-10452.22.0
    with latest bluez package on Apr 16 2018.

What steps will reproduce the problem?
Run magic tethering on Eve with 1 pixel phone.

What is the expected result?
It should run without bluetoothd crash.

What happens instead?
Bluetoothd crash is observed daily or every other day.

In the attached tarball of logs, bluetoothd segfault occurred twice in one hour.

2018-04-16T13:03:54.586304+08:00 INFO kernel: [  809.057963] bluetoothd[2634]: segfault at 2c ip 00005bd6b0dc416c sp 00007ffd546ebe80 error 4 in bluetoothd[5bd6b0d47000+c9000]
2018-04-16T13:07:54.612540+08:00 INFO kernel: [ 1049.092709] bluetoothd[6940]: segfault at 2c ip 00005cb257d3a16c sp 00007ffca66563b0 error 4 in bluetoothd[5cb257cbd000+c9000]

The backtrace of the core dump

#0  prim_attr_cmp (a=0x0, b=<optimized out>) at src/device.c:4005                                                                                                            
#1  0x00007b0009778de2 in g_slist_find_custom (list=0x5bd6b128d710, data=0x5bd6b127d5f0, func=0x5bd6b0dc4140 <prim_attr_cmp>) at ../../glib-2.50.3/glib/gslist.c:746         
#2  0x00005bd6b0dc190b in gatt_service_removed (attr=0x5bd6b127d5f0, user_data=0x5bd6b128a660) at src/device.c:4045                                                          
#3  0x00005bd6b0de08bc in queue_foreach (queue=0x5bd6b12792a0, function=0x5bd6b0deeae0 <handle_notify>, user_data=0x7ffd546ebfa0) at src/shared/queue.c:220                  
#4  0x00005bd6b0decb60 in notify_service_changed (db=0x5bd6b12862c0, service=0x5bd6b127c100, added=false) at src/shared/gatt-db.c:268                                        
#5  gatt_db_service_destroy (data=0x5bd6b127c100) at src/shared/gatt-db.c:279                                                                                                
#6  0x00005bd6b0de069f in queue_remove_all (queue=<optimized out>, function=<optimized out>, user_data=0x0, destroy=0x5bd6b0decb00 <gatt_db_service_destroy>)                
    at src/shared/queue.c:351                                                                                                                                                
#7  0x00005bd6b0decc62 in gatt_db_clear_range (db=0x5bd6b12862c0, start_handle=1, end_handle=65535) at src/shared/gatt-db.c:453                                              
#8  gatt_db_clear (db=0x5bd6b12862c0) at src/shared/gatt-db.c:413                                                                                                            
#9  0x00005bd6b0dc4860 in gatt_cache_cleanup (device=<optimized out>) at src/device.c:573                                                                                    
#10 gatt_client_cleanup (device=0x5bd6b128a660) at src/device.c:581                                                                                                          
#11 0x00005bd6b0dbe994 in attio_cleanup (device=0x5bd6b128a660) at src/device.c:609                                                                                          
#12 0x00005bd6b0dc1673 in device_free (user_data=0x5bd6b128a660) at src/device.c:667                                                                                         
#13 0x00005bd6b0ddba4f in remove_interface (data=0x5bd6b128ec00, name=0x5bd6b0e08604 "org.bluez.Device1") at gdbus/object.c:667                                              
#14 0x00005bd6b0ddb980 in g_dbus_unregister_interface (connection=0x5bd6b121a1b0, path=<optimized out>, name=0x5bd6b0e08604 "org.bluez.Device1") at gdbus/object.c:1391      
#15 0x00005bd6b0dbd2f0 in btd_device_unref (device=<optimized out>) at src/device.c:6573                                                                                     
#16 device_remove (device=0x5bd6b128a660, remove_stored=<optimized out>) at src/device.c:4515                                                                                
#17 0x00005bd6b0da35c8 in btd_adapter_remove_device (adapter=0x5bd6b1233660, dev=0x5bd6b128a660) at src/adapter.c:1248                                                       
#18 0x00005bd6b0da846c in dev_disconnected (adapter=0x5bd6b1233660, addr=0x5bd6b121cae9, reason=<optimized out>) at src/adapter.c:7456                                       
#19 0x00005bd6b0de1236 in request_complete (mgmt=<optimized out>, status=0 '\000', opcode=20, length=7, param=0x5bd6b121cae9, index=<optimized out>)                         
    at src/shared/mgmt.c:261                                                                                                                                                 
#20 can_read_data (io=<optimized out>, user_data=0x5bd6b121ca60) at src/shared/mgmt.c:353                                                                                    
#21 0x00005bd6b0deefc4 in watch_callback (channel=<optimized out>, cond=<optimized out>, user_data=0x7ffd546ebe8c) at src/shared/io-glib.c:170                               
#22 0x00007b00097596b6 in g_main_dispatch (context=<optimized out>) at ../../glib-2.50.3/glib/gmain.c:3203                                                                   
#23 g_main_context_dispatch (context=<optimized out>) at ../../glib-2.50.3/glib/gmain.c:3856                                                                                 
#24 0x00007b0009759a5e in g_main_context_iterate (context=<optimized out>, block=<optimized out>, dispatch=<optimized out>, self=<optimized out>)                            
    at ../../glib-2.50.3/glib/gmain.c:3929                                                                                                                                   
#25 0x00007b0009759d8f in g_main_loop_run (loop=0x5bd6b1211ed0) at ../../glib-2.50.3/glib/gmain.c:4125                                                                       
#26 0x00005bd6b0dc9334 in main (argc=1, argv=0x7ffd546ed608) at src/main.c:796

The corresponding source code

3997    static gint prim_attr_cmp(gconstpointer a, gconstpointer b)
3998    {
3999            const struct gatt_primary *prim = a;
4000            const struct gatt_db_attribute *attr = b;
4001            uint16_t start, end;
4002
4003            gatt_db_attribute_get_service_handles(attr, &start, &end);
4004
4005            return !(prim->range.start == start && prim->range.end == end);

Sometimes the pointer prim is NULL. The null pointer dereference at line 4005 caused the segmentation fault.
 
The logs including the crash core dumps were attached for C#0.
0416_1250b_mt_0x15_segfault_twice.tbz2
3.9 MB Download
Project Member

Comment 2 by bugdroid1@chromium.org, Apr 21 2018

Labels: merge-merged-chromeos-5.44
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/bluez/+/d8ee0901515c465d618fe3ab12ceec85dbf50972

commit d8ee0901515c465d618fe3ab12ceec85dbf50972
Author: Joseph Hwang <josephsih@chromium.org>
Date: Sat Apr 21 10:52:53 2018

CHROMIUM: device.c: sanity check of NULL pointers in prim_attr_cmp()

In some cases, the gatt_primary service structure may be NULL.
Do not dereference the NULL pointer when this occurs.

BUG= chromium:833389 ,b:78181184
TEST=Run magic tethering for a few hours. Observe that no segfault
occurs in prim_attr_cmp().

Change-Id: I26ba52233f720f8d6f28d4948bf6d9658333f251
Reviewed-on: https://chromium-review.googlesource.com/1016441
Commit-Ready: Shyh-In Hwang <josephsih@chromium.org>
Tested-by: Shyh-In Hwang <josephsih@chromium.org>
Reviewed-by: Dmitry Grinberg <dmitrygr@google.com>
Reviewed-by: Miao-chen Chou <mcchou@chromium.org>

[modify] https://crrev.com/d8ee0901515c465d618fe3ab12ceec85dbf50972/src/device.c

Project Member

Comment 3 by bugdroid1@chromium.org, Apr 26 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/bluez/+/d933f7ed8dd8cf2d29bae7a32e58e38c11cd89aa

commit d933f7ed8dd8cf2d29bae7a32e58e38c11cd89aa
Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Date: Thu Apr 26 06:00:40 2018

BACKPORT: device: Fix crash freeing device

Calling gatt_db_register with NULL pointers makes no sense since it does
nothing when the callbacks are NULL so the callback are still reachable
causing invalid memory to accessed:

Invalid read of size 8
   at 0x50EAFDC: g_slist_find_custom (in /usr/lib64/libglib-2.0.so.0.5000.3)
   by 0x46CDA1: gatt_service_removed (device.c:3563)
   by 0x4896F8: queue_foreach (queue.c:220)
   by 0x4951FB: notify_service_changed (gatt-db.c:268)
   by 0x4951FB: gatt_db_service_destroy (gatt-db.c:279)
   by 0x4898F5: queue_remove_all (queue.c:336)
   by 0x4952E2: gatt_db_clear_range (gatt-db.c:461)
   by 0x48F32B: discovery_op_unref (gatt-client.c:447)
   by 0x4979AA: bt_gatt_request_unref (gatt-helpers.c:594)
   by 0x490489: bt_gatt_client_cancel_all (gatt-client.c:2083)
   by 0x4904D8: bt_gatt_client_free (gatt-client.c:1752)
   by 0x46CF70: gatt_client_cleanup (device.c:561)
   by 0x46D01A: attio_cleanup (device.c:586)
 Address 0x86cb940 is 0 bytes inside a block of size 16 free'd
   at 0x4C2ED4A: free (vg_replace_malloc.c:530)
   by 0x50D16CD: g_free (in /usr/lib64/libglib-2.0.so.0.5000.3)
   by 0x50EA743: g_slice_free_chain_with_offset (in /usr/lib64/libglib-2.0.so.0.5000.3)
   by 0x46D18C: device_free (device.c:638)
   by 0x485B05: remove_interface (object.c:667)
   by 0x485FF9: g_dbus_unregister_interface (object.c:1391)
   by 0x45EFA9: btd_adapter_remove_device (adapter.c:1200)
   by 0x45FBC3: dev_disconnected (adapter.c:6800)
   by 0x48A1A5: request_complete (mgmt.c:261)
   by 0x48AC0B: can_read_data (mgmt.c:353)
   by 0x496954: watch_callback (io-glib.c:170)
   by 0x50CBE51: g_main_context_dispatch (in /usr/lib64/libglib-2.0.so.0.5000.3)

BUG= chromium:833389 ,b:78181184
TEST=Enable debug option of bluetoothd.
Run magic tethering for a few hours.
Observe that there are no warning messages as below in
/var/log/messages.

    WARNING bluetoothd[2657]: prim_attr_cmp: null gatt_primary

Change-Id: I90963d7eb84f62d123d5644f4ac66763568d47ec
Signed-off-by: Joseph Hwang <josephsih@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1023498
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Commit-Ready: Shyh-In Hwang <josephsih@chromium.org>
Tested-by: Shyh-In Hwang <josephsih@chromium.org>
Reviewed-by: Miao-chen Chou <mcchou@chromium.org>

[modify] https://crrev.com/d933f7ed8dd8cf2d29bae7a32e58e38c11cd89aa/src/device.c

Status: Fixed (was: Started)
Labels: Merge-Request-67
It is desirable to merge the 2 patches in C#2 and C#3 back to M-67 in order to fix bluetoothd crashes we saw frequently. Thanks!
Project Member

Comment 6 by sheriffbot@chromium.org, May 14 2018

Labels: -Merge-Request-67 Merge-Review-67 Hotlist-Merge-Review
This bug requires manual review: M67 has already been promoted to the beta branch, so this requires manual review
Please contact the milestone owner if you have questions.
Owners: cmasso@(Android), cmasso@(iOS), kbleicher@(ChromeOS), govind@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot

Comment 7 by gov...@chromium.org, May 14 2018

Labels: OS-Chrome
Has this change been extensively tested and confirmed working / low risk?  We're pretty far long with M67 so need to mitigate risk.  Esp given that this could have broad board impact. This wasn't a M67 regression, so prefer to wait for M68.
Here below was the test result from Pramod:

"I have tested on Eve on 10682.0.0 build, and the manual tests are all passed. No Regressions were observed."

I think it would be less risky by accepting the patches than postponing to next milestone. I personally used to experience the crash several times a day when running magic tethering. Thanks!
Labels: -Merge-Review-67 Merge-Approved-67
Approving merge to M67 Chrome OS.
Thanks, Kevin! Cherry-picking the two patches to M67.
Project Member

Comment 12 by bugdroid1@chromium.org, May 20 2018

Labels: merge-merged-release-R67-10575.B
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/bluez/+/e8a45ca64f97ccb0fac07b38829dd47d86de5326

commit e8a45ca64f97ccb0fac07b38829dd47d86de5326
Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Date: Sun May 20 14:46:54 2018

BACKPORT: device: Fix crash freeing device

Calling gatt_db_register with NULL pointers makes no sense since it does
nothing when the callbacks are NULL so the callback are still reachable
causing invalid memory to accessed:

Invalid read of size 8
   at 0x50EAFDC: g_slist_find_custom (in /usr/lib64/libglib-2.0.so.0.5000.3)
   by 0x46CDA1: gatt_service_removed (device.c:3563)
   by 0x4896F8: queue_foreach (queue.c:220)
   by 0x4951FB: notify_service_changed (gatt-db.c:268)
   by 0x4951FB: gatt_db_service_destroy (gatt-db.c:279)
   by 0x4898F5: queue_remove_all (queue.c:336)
   by 0x4952E2: gatt_db_clear_range (gatt-db.c:461)
   by 0x48F32B: discovery_op_unref (gatt-client.c:447)
   by 0x4979AA: bt_gatt_request_unref (gatt-helpers.c:594)
   by 0x490489: bt_gatt_client_cancel_all (gatt-client.c:2083)
   by 0x4904D8: bt_gatt_client_free (gatt-client.c:1752)
   by 0x46CF70: gatt_client_cleanup (device.c:561)
   by 0x46D01A: attio_cleanup (device.c:586)
 Address 0x86cb940 is 0 bytes inside a block of size 16 free'd
   at 0x4C2ED4A: free (vg_replace_malloc.c:530)
   by 0x50D16CD: g_free (in /usr/lib64/libglib-2.0.so.0.5000.3)
   by 0x50EA743: g_slice_free_chain_with_offset (in /usr/lib64/libglib-2.0.so.0.5000.3)
   by 0x46D18C: device_free (device.c:638)
   by 0x485B05: remove_interface (object.c:667)
   by 0x485FF9: g_dbus_unregister_interface (object.c:1391)
   by 0x45EFA9: btd_adapter_remove_device (adapter.c:1200)
   by 0x45FBC3: dev_disconnected (adapter.c:6800)
   by 0x48A1A5: request_complete (mgmt.c:261)
   by 0x48AC0B: can_read_data (mgmt.c:353)
   by 0x496954: watch_callback (io-glib.c:170)
   by 0x50CBE51: g_main_context_dispatch (in /usr/lib64/libglib-2.0.so.0.5000.3)

BUG= chromium:833389 ,b:78181184
TEST=Enable debug option of bluetoothd.
Run magic tethering for a few hours.
Observe that there are no warning messages as below in
/var/log/messages.

    WARNING bluetoothd[2657]: prim_attr_cmp: null gatt_primary

Change-Id: I90963d7eb84f62d123d5644f4ac66763568d47ec
Signed-off-by: Joseph Hwang <josephsih@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1023498
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Commit-Ready: Shyh-In Hwang <josephsih@chromium.org>
Tested-by: Shyh-In Hwang <josephsih@chromium.org>
Reviewed-by: Miao-chen Chou <mcchou@chromium.org>
(cherry picked from commit d933f7ed8dd8cf2d29bae7a32e58e38c11cd89aa)
Reviewed-on: https://chromium-review.googlesource.com/1063051
Reviewed-by: Sonny Sasaka <sonnysasaka@chromium.org>

[modify] https://crrev.com/e8a45ca64f97ccb0fac07b38829dd47d86de5326/src/device.c

Project Member

Comment 13 by bugdroid1@chromium.org, May 20 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/bluez/+/4abb61ce31f253e6a91d29e75c9eb2eeda6e2dcc

commit 4abb61ce31f253e6a91d29e75c9eb2eeda6e2dcc
Author: Joseph Hwang <josephsih@chromium.org>
Date: Sun May 20 14:46:55 2018

CHROMIUM: device.c: sanity check of NULL pointers in prim_attr_cmp()

In some cases, the gatt_primary service structure may be NULL.
Do not dereference the NULL pointer when this occurs.

BUG= chromium:833389 ,b:78181184
TEST=Run magic tethering for a few hours. Observe that no segfault
occurs in prim_attr_cmp().

Change-Id: I26ba52233f720f8d6f28d4948bf6d9658333f251
Reviewed-on: https://chromium-review.googlesource.com/1016441
Commit-Ready: Shyh-In Hwang <josephsih@chromium.org>
Tested-by: Shyh-In Hwang <josephsih@chromium.org>
Reviewed-by: Dmitry Grinberg <dmitrygr@google.com>
Reviewed-by: Miao-chen Chou <mcchou@chromium.org>
(cherry picked from commit d8ee0901515c465d618fe3ab12ceec85dbf50972)
Reviewed-on: https://chromium-review.googlesource.com/1063050
Reviewed-by: Sonny Sasaka <sonnysasaka@chromium.org>
Commit-Queue: Shyh-In Hwang <josephsih@chromium.org>
Trybot-Ready: Shyh-In Hwang <josephsih@chromium.org>

[modify] https://crrev.com/4abb61ce31f253e6a91d29e75c9eb2eeda6e2dcc/src/device.c

Project Member

Comment 14 by sheriffbot@chromium.org, May 21 2018

Cc: kbleicher@chromium.org
This issue has been approved for a merge. Please merge the fix to any appropriate branches as soon as possible!

If all merges have been completed, please remove any remaining Merge-Approved labels from this issue.

Thanks for your time! To disable nags, add the Disable-Nags label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: -Merge-Approved-67
Status: Verified (was: Fixed)
Thanks for reminding!

Marked it as Verified since Pramod has conducted manual tests.

Sign in to add a comment