New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 770863 link

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Oct 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



Sign in to add a comment

Crash in BleSynchronizer when bluetoothd has died

Project Member Reported by khorimoto@chromium.org, Oct 2 2017

Issue description

Repro:
(1) Start with Instant Tethering enabled.
(2) In terminal, run `restart bluetoothd`.
(3) Start scanning for tether hosts.
(4) Disable Mobile data in settings.

Expected: Mobile data gets disabled.

Actual: Crash.

The crash occurs in BleSynchronizer::CompleteCurrentCommand() (see [1]). The crash is due to referencing clock_. It is possible that one of the BLE callbacks results in the Tether component getting shut down, so by the time clock_ is referenced, it has been deleted.

The fix is to invoke CompleteCurrentCommand() asynchronously by posting the task as a new runnable on the current message loop.

[1] https://cs.chromium.org/chromium/src/chromeos/components/tether/ble_synchronizer.cc?q=CompleteCurrentCommand
 
Labels: ReleaseBlock-Stable
Labels: Merge-Request-62
Requesting M-62 merge for https://chromium-review.googlesource.com/c/chromium/src/+/696331.
Project Member

Comment 3 by sheriffbot@chromium.org, Oct 3 2017

Labels: -Merge-Request-62 Merge-Review-62 Hotlist-Merge-Review
This bug requires manual review: We are only 13 days from stable.
Please contact the milestone owner if you have questions.
Owners: amineer@(Android), cmasso@(iOS), bhthompson@(ChromeOS), abdulsyed@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: -Hotlist-Merge-Review -Merge-Review-62 Merge-Approved-62
Approved for 62.
Project Member

Comment 5 by bugdroid1@chromium.org, Oct 3 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/604aeb5efed33c8cefa53adf493c638a498957ff

commit 604aeb5efed33c8cefa53adf493c638a498957ff
Author: Kyle Horimoto <khorimoto@google.com>
Date: Tue Oct 03 22:00:48 2017

[CrOS Tether] Fix crash in BleSynchronizer.

The crash occurred in the following situation:
(1) Tether component gets shut down, but BLE advertising/scanning is
    still active.
(2) Tether component shuts down asynchronously by waiting for active
    advertising/scanning to finish, then deleting all Tether-related
    objects.
(3) BleSynchronizer finishes a command (either unregister advertisement
    or stop discovery session) and alerts clients.
(4) Client discovers that asynchronous shutdown is complete and deletes
    the Tether component.
(5) BleSynchronizer tries to execute the next command, but it cannot
    because it has already been deleted. A crash occurs because deleted
    memory is referenced.

This CL fixes the issue by executing the next command as a new task,
ensuring that the next command will not be executed if BleSynchronizer
has been deleted.

Bug:  770863 , 672263
Change-Id: I52f08990ab08b5de0f506d214fc0b1b8544d5b0c
Reviewed-on: https://chromium-review.googlesource.com/696331
Commit-Queue: Kyle Horimoto <khorimoto@chromium.org>
Reviewed-by: Ryan Hansberry <hansberry@chromium.org>
Cr-Commit-Position: refs/heads/master@{#506194}
[modify] https://crrev.com/604aeb5efed33c8cefa53adf493c638a498957ff/chromeos/components/tether/BUILD.gn
[modify] https://crrev.com/604aeb5efed33c8cefa53adf493c638a498957ff/chromeos/components/tether/ble_advertiser.cc
[modify] https://crrev.com/604aeb5efed33c8cefa53adf493c638a498957ff/chromeos/components/tether/ble_advertiser.h
[modify] https://crrev.com/604aeb5efed33c8cefa53adf493c638a498957ff/chromeos/components/tether/ble_advertiser_unittest.cc
[modify] https://crrev.com/604aeb5efed33c8cefa53adf493c638a498957ff/chromeos/components/tether/ble_scanner.cc
[modify] https://crrev.com/604aeb5efed33c8cefa53adf493c638a498957ff/chromeos/components/tether/ble_scanner.h
[modify] https://crrev.com/604aeb5efed33c8cefa53adf493c638a498957ff/chromeos/components/tether/ble_synchronizer.cc
[modify] https://crrev.com/604aeb5efed33c8cefa53adf493c638a498957ff/chromeos/components/tether/ble_synchronizer.h
[add] https://crrev.com/604aeb5efed33c8cefa53adf493c638a498957ff/chromeos/components/tether/ble_synchronizer_base.cc
[add] https://crrev.com/604aeb5efed33c8cefa53adf493c638a498957ff/chromeos/components/tether/ble_synchronizer_base.h
[modify] https://crrev.com/604aeb5efed33c8cefa53adf493c638a498957ff/chromeos/components/tether/ble_synchronizer_unittest.cc
[modify] https://crrev.com/604aeb5efed33c8cefa53adf493c638a498957ff/chromeos/components/tether/error_tolerant_ble_advertisement_impl.cc
[modify] https://crrev.com/604aeb5efed33c8cefa53adf493c638a498957ff/chromeos/components/tether/error_tolerant_ble_advertisement_impl.h
[modify] https://crrev.com/604aeb5efed33c8cefa53adf493c638a498957ff/chromeos/components/tether/fake_ble_synchronizer.cc
[modify] https://crrev.com/604aeb5efed33c8cefa53adf493c638a498957ff/chromeos/components/tether/fake_ble_synchronizer.h

Project Member

Comment 6 by bugdroid1@chromium.org, Oct 3 2017

Labels: -merge-approved-62 merge-merged-3202
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/4ee2c1c6bf38244e5b77d9ec5d7d95b0e9421943

commit 4ee2c1c6bf38244e5b77d9ec5d7d95b0e9421943
Author: Kyle Horimoto <khorimoto@google.com>
Date: Tue Oct 03 22:17:13 2017

[CrOS Tether] Fix crash in BleSynchronizer.

The crash occurred in the following situation:
(1) Tether component gets shut down, but BLE advertising/scanning is
    still active.
(2) Tether component shuts down asynchronously by waiting for active
    advertising/scanning to finish, then deleting all Tether-related
    objects.
(3) BleSynchronizer finishes a command (either unregister advertisement
    or stop discovery session) and alerts clients.
(4) Client discovers that asynchronous shutdown is complete and deletes
    the Tether component.
(5) BleSynchronizer tries to execute the next command, but it cannot
    because it has already been deleted. A crash occurs because deleted
    memory is referenced.

This CL fixes the issue by executing the next command as a new task,
ensuring that the next command will not be executed if BleSynchronizer
has been deleted.

TBR=khorimoto@google.com

(cherry picked from commit 604aeb5efed33c8cefa53adf493c638a498957ff)

Bug:  770863 , 672263
Change-Id: I52f08990ab08b5de0f506d214fc0b1b8544d5b0c
Reviewed-on: https://chromium-review.googlesource.com/696331
Commit-Queue: Kyle Horimoto <khorimoto@chromium.org>
Reviewed-by: Ryan Hansberry <hansberry@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#506194}
Reviewed-on: https://chromium-review.googlesource.com/699060
Reviewed-by: Kyle Horimoto <khorimoto@chromium.org>
Cr-Commit-Position: refs/branch-heads/3202@{#566}
Cr-Branched-From: fa6a5d87adff761bc16afc5498c3f5944c1daa68-refs/heads/master@{#499098}
[modify] https://crrev.com/4ee2c1c6bf38244e5b77d9ec5d7d95b0e9421943/chromeos/components/tether/BUILD.gn
[modify] https://crrev.com/4ee2c1c6bf38244e5b77d9ec5d7d95b0e9421943/chromeos/components/tether/ble_advertiser.cc
[modify] https://crrev.com/4ee2c1c6bf38244e5b77d9ec5d7d95b0e9421943/chromeos/components/tether/ble_advertiser.h
[modify] https://crrev.com/4ee2c1c6bf38244e5b77d9ec5d7d95b0e9421943/chromeos/components/tether/ble_advertiser_unittest.cc
[modify] https://crrev.com/4ee2c1c6bf38244e5b77d9ec5d7d95b0e9421943/chromeos/components/tether/ble_scanner.cc
[modify] https://crrev.com/4ee2c1c6bf38244e5b77d9ec5d7d95b0e9421943/chromeos/components/tether/ble_scanner.h
[modify] https://crrev.com/4ee2c1c6bf38244e5b77d9ec5d7d95b0e9421943/chromeos/components/tether/ble_synchronizer.cc
[modify] https://crrev.com/4ee2c1c6bf38244e5b77d9ec5d7d95b0e9421943/chromeos/components/tether/ble_synchronizer.h
[add] https://crrev.com/4ee2c1c6bf38244e5b77d9ec5d7d95b0e9421943/chromeos/components/tether/ble_synchronizer_base.cc
[add] https://crrev.com/4ee2c1c6bf38244e5b77d9ec5d7d95b0e9421943/chromeos/components/tether/ble_synchronizer_base.h
[modify] https://crrev.com/4ee2c1c6bf38244e5b77d9ec5d7d95b0e9421943/chromeos/components/tether/ble_synchronizer_unittest.cc
[modify] https://crrev.com/4ee2c1c6bf38244e5b77d9ec5d7d95b0e9421943/chromeos/components/tether/error_tolerant_ble_advertisement_impl.cc
[modify] https://crrev.com/4ee2c1c6bf38244e5b77d9ec5d7d95b0e9421943/chromeos/components/tether/error_tolerant_ble_advertisement_impl.h
[modify] https://crrev.com/4ee2c1c6bf38244e5b77d9ec5d7d95b0e9421943/chromeos/components/tether/fake_ble_synchronizer.cc
[modify] https://crrev.com/4ee2c1c6bf38244e5b77d9ec5d7d95b0e9421943/chromeos/components/tether/fake_ble_synchronizer.h

Status: Fixed (was: Started)
Cc: mkarkada@chromium.org steve...@chromium.org dhadd...@chromium.org sdantul...@chromium.org
 Issue 773375  has been merged into this issue.
Issue 773389 has been merged into this issue.
Issue 773824 has been merged into this issue.
We are still seeing a crash with the same signature in recent Canaries.  We might have too little data to draw any conclusions, but it seems that the crash might be related to OOPIFs (there are 4 crashes in the site-per-process experiment group and 0 in the control group (*)).

Q: Should I reactivate this bug?  Open a new one?

(*) https://crash.corp.google.com/browse?q=product.name%3D%27Chrome_ChromeOS%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27chromeos%3A%3Atether%3A%3ABleSynchronizer%3A%3AOnAdvertisementUnregistered%27&sql_dialect=googlesql&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D&compProp=custom_data.ChromeCrashProto.experiments.ids&v1=47edb3-d3f2b6d3&v2=47edb3-e5224fcd
lukasza@: This change should be completely unrelated to OOPIFs, so I have no reason to believe that the experiment would cause this bug.

However, I just looked through all of the crashes with that signature, and they are all crashes on ble_synchronizer.cc:296. Line 296 was removed in part of the fix for this bug (the file is now <296 lines total). This is confusing to me since Canary should have the newest version of the file. Can you help provide me with an actionable stack trace? Thanks! :)
Good point - all 6 crashes I see in the site-per-process experiment group happen in 63.0.3230.0 (so before r506194 which initially landed in 63.0.3232.0).  Therefore - it is probably okay to ignore the reports I've pointed out in #c11.

Thanks for looking!

Sign in to add a comment