New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 748407 link

Starred by 1 user

Issue metadata

Status: Started
Owner:
Last visit > 30 days ago
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug

Blocking:
issue 739688



Sign in to add a comment

wpa_supplicant crashes when loading pkcs11.so

Project Member Reported by benchan@chromium.org, Jul 25 2017

Issue description

Observed the following wpa_supplicant crash on ToT when running some autotests that involves interactions with shill over ShillProxy (e.g. shill_proxy.find_object). The crash site seems to be at `OPENSSL_free(ctx->module)`, which looks like a memory corruption or double free.

INFO    : Dumping stack for wpa_supplicant.20170725.001302.633.dmp with /build/coral/usr/lib/debug/breakpad:
Operating system: Linux
                  0.0.0 Linux 4.4.70-11168-gfbf429785e0e #1 SMP PREEMPT Mon Jul 24 22:18:24 PDT 2017 x86_64
CPU: amd64
     family 6 model 92 stepping 9
     1 CPU

GPU: UNKNOWN

Crash reason:  SIGABRT
Crash address: 0x0
Process uptime: not available

Thread 0 (crashed)
 0  libc-2.23.so!raise [raise.c : 54 + 0x10]
    rax = 0x0000000000000000   rdx = 0x0000000000000006
    rcx = 0x000079713c857eb2   rbx = 0x0000000000000061
    rsi = 0x0000000000000279   rdi = 0x0000000000000279
    rbp = 0x00007ffc1fb7f130   rsp = 0x00007ffc1fb7f008
     r8 = 0x000000000000000f    r9 = 0x0000000000000000
    r10 = 0x0000000000000008   r11 = 0x0000000000000202
    r12 = 0x0000000000000061   r13 = 0x00007ffc1fb7f1b8
    r14 = 0x00007ffc1fb7f1b8   r15 = 0x0000000000000002
    rip = 0x000079713c857eb2
    Found by: given as instruction pointer in context
 1  libc-2.23.so!abort [abort.c : 89 + 0xa]
    rbx = 0x0000000000000061   rbp = 0x00007ffc1fb7f130
    rsp = 0x00007ffc1fb7f010   r12 = 0x0000000000000061
    r13 = 0x00007ffc1fb7f1b8   r14 = 0x00007ffc1fb7f1b8
    r15 = 0x0000000000000002   rip = 0x000079713c859cd6
    Found by: call frame info
 2  libc-2.23.so!__libc_message [libc_fatal.c : 175 + 0x5]
    rbx = 0x0000000000000061   rbp = 0x00007ffc1fb7f3b0
    rsp = 0x00007ffc1fb7f140   r12 = 0x0000000000000061
    r13 = 0x00007ffc1fb7f1b8   r14 = 0x00007ffc1fb7f1b8
    r15 = 0x0000000000000002   rip = 0x000079713c899577
    Found by: call frame info
 3  libc-2.23.so!malloc_printerr [malloc.c : 5004 + 0x5]
    rbx = 0x00007ffc1fb80eb5   rbp = 0x00007ffc1fb7f400
    rsp = 0x00007ffc1fb7f3c0   r12 = 0x0000000000000003
    r13 = 0x000079713c99c860   r14 = 0x000057ddd2b37af0
    r15 = 0x000057ddd2b2fc10   rip = 0x000079713c89f6f1
    Found by: call frame info
 4  libcrypto.so.1.0.0!<name omitted> [mem.c : 434 + 0x9]
    rbx = 0x000079713d0b39a0   rbp = 0x00007ffc1fb7f420
    rsp = 0x00007ffc1fb7f410   r12 = 0x000057ddd2b37a10
    r13 = 0x000057ddd1a9d9b0   r14 = 0x000057ddd2b37af0
    r15 = 0x000057ddd2b2fc10   rip = 0x000079713cd62b15
    Found by: call frame info
 5  pkcs11.so!ctx_engine_ctrl [eng_back.c : 856 + 0x9]
    rbx = 0x000057ddd2b2fc10   rbp = 0x00007ffc1fb7f440
    rsp = 0x00007ffc1fb7f430   r12 = 0x000057ddd2b37a10
    r13 = 0x000057ddd1a9d9b0   r14 = 0x000057ddd2b37af0
    r15 = 0x000057ddd2b2fc10   rip = 0x000079713d0b5157
    Found by: call frame info
 6  libcrypto.so.1.0.0!ENGINE_ctrl_cmd_string [eng_ctrl.c : 357 + 0x13]
    rbx = 0x00000000000000c9   rbp = 0x00007ffc1fb7f480
    rsp = 0x00007ffc1fb7f450   r12 = 0x000057ddd2b37a10
    r13 = 0x000057ddd1a9d9b0   r14 = 0x00000000000000c9
    r15 = 0x000057ddd2b2fc10   rip = 0x000079713ccd837a
    Found by: call frame info
 7  wpa_supplicant!tls_engine_load_dynamic_generic [tls_openssl.c : 676 + 0xa]
    rbx = 0x00007ffc1fb7f540   rbp = 0x00007ffc1fb7f4c0
    rsp = 0x00007ffc1fb7f490   r12 = 0x00007ffc1fb7f4f0
    r13 = 0x000057ddd1a9d9b0   r14 = 0x000057ddd1aa6afd
    r15 = 0x000057ddd2b37a10   rip = 0x000057ddd1a1d714
    Found by: call frame info
 8  wpa_supplicant!tls_init [tls_openssl.c : 727 + 0x16]
    rbx = 0x00007ffc1fb7f540   rbp = 0x00007ffc1fb7f590
    rsp = 0x00007ffc1fb7f4d0   r12 = 0x000057ddd2b35160
    r13 = 0x000057ddd1a9d9b0   r14 = 0x000057ddd1a9d75a
    r15 = 0x000057ddd2b36260   rip = 0x000057ddd1a191b2
    Found by: call frame info
 9  wpa_supplicant!eap_peer_sm_init [eap.c : 1941 + 0x5]
    rbx = 0x000057ddd2b2f7f0   rbp = 0x00007ffc1fb7f610
    rsp = 0x00007ffc1fb7f5a0   r12 = 0x000057ddd1ab0090
    r13 = 0x000057ddd2b251b0   r14 = 0x00007ffc1fb7f620
    r15 = 0x000057ddd2b31b40   rip = 0x000057ddd1a13564
    Found by: call frame info
10  wpa_supplicant!eapol_sm_init [eapol_supp_sm.c : 2067 + 0x13]
    rbx = 0x000057ddd2b251b0   rbp = 0x00007ffc1fb7f670
    rsp = 0x00007ffc1fb7f620   r12 = 0x000057ddd2b31b40
    r13 = 0x000057ddd1a9f6d6   r14 = 0x000057ddd2b30850
    r15 = 0x000057ddd2b31c28   rip = 0x000057ddd1a1254b
    Found by: call frame info
11  wpa_supplicant!wpa_supplicant_init_eapol [wpas_glue.c : 993 + 0x8]
    rbx = 0x000057ddd2b30850   rbp = 0x00007ffc1fb7f690
    rsp = 0x00007ffc1fb7f680   r12 = 0x000057ddd2b31b40
    r13 = 0x000057ddd1a9f6d6   r14 = 0x000057ddd2b31b40
    r15 = 0x000057ddd2b31c28   rip = 0x000057ddd1a58752
    Found by: call frame info
12  wpa_supplicant!wpa_supplicant_add_iface [wpa_supplicant.c : 4478 + 0x8]
    rbx = 0x0000000000000000   rbp = 0x00007ffc1fb7f7b0
    rsp = 0x00007ffc1fb7f6a0   r12 = 0x000057ddd2b31b40
    r13 = 0x000057ddd1a9f6d6   r14 = 0x000057ddd2b40401
    r15 = 0x000057ddd2b31c28   rip = 0x000057ddd1a506da
    Found by: call frame info
13  wpa_supplicant!wpas_dbus_handler_create_interface [dbus_new_handlers.c : 617 + 0xa]
    rbx = 0x000057ddd2b309a0   rbp = 0x00007ffc1fb7f910
    rsp = 0x00007ffc1fb7f7c0   r12 = 0x000057ddd2b00af0
    r13 = 0x0000000000000000   r14 = 0x000057ddd2b03b70
    r15 = 0x000057ddd2b00af0   rip = 0x000057ddd1a384b9
    Found by: call frame info
14  wpa_supplicant!message_handler [dbus_new_helpers.c : 354 + 0x7]
    rbx = 0x0000000000000100   rbp = 0x00007ffc1fb7fb20
    rsp = 0x00007ffc1fb7f920   r12 = 0x000057ddd2b04050
    r13 = 0x000057ddd1ab05f0   r14 = 0x00007ffc1fb7f9f5
    r15 = 0x000057ddd1ab0608   rip = 0x000057ddd1a34d92
    Found by: call frame info
15  libdbus-1.so.3.14.8 + 0x2866a
    rbx = 0x0000000000000000   rbp = 0x00007ffc1fb7fc00
    rsp = 0x00007ffc1fb7fb30   r12 = 0x000057ddd2b01140
    r13 = 0x000057ddd1a34910   r14 = 0x0000000000000000
    r15 = 0x000057ddd2b01670   rip = 0x000079713d11666a
    Found by: call frame info
16  libdbus-1.so.3.14.8 + 0x170f6
    rbp = 0x00007ffc1fb7fcb0   rsp = 0x00007ffc1fb7fc10
    rip = 0x000079713d1050f6
    Found by: previous frame's frame pointer
17  wpa_supplicant!process_watch [dbus_common.c : 38 + 0x8]
    rbp = 0x00007ffc1fb7fce0   rsp = 0x00007ffc1fb7fcc0
    rip = 0x000057ddd1a3f8e8
    Found by: previous frame's frame pointer
18  wpa_supplicant!eloop_run [eloop.c : 495 + 0x4]
    rbx = 0x0000000000000000   rbp = 0x00007ffc1fb7fd60
    rsp = 0x00007ffc1fb7fcf0   r14 = 0x000057ddd1ab5850
    r15 = 0x0000000000000004   rip = 0x000057ddd19e830a
    Found by: call frame info
19  wpa_supplicant!wpa_supplicant_run [wpa_supplicant.c : 5033 + 0x5]
    rbx = 0x0000000000000001   rbp = 0x00007ffc1fb7fd90
    rsp = 0x00007ffc1fb7fd70   r12 = 0x000057ddd2b00610
    r13 = 0x000057ddd2b00610   r14 = 0x0000000000000001
    r15 = 0x000057ddd2b00af0   rip = 0x000057ddd1a5186f
    Found by: call frame info
20  wpa_supplicant!main [main.c : 345 + 0x8]
    rbx = 0x0000000000000001   rbp = 0x00007ffc1fb7fe80
    rsp = 0x00007ffc1fb7fda0   r12 = 0x000057ddd2b00610
    r13 = 0x000057ddd2b00610   r14 = 0x0000000000000001
    r15 = 0x000057ddd2b00af0   rip = 0x000057ddd1a5ddad
    Found by: call frame info
21  libc-2.23.so!__libc_start_main [libc-start.c : 289 + 0x1a]
    rbx = 0x0000000000000000   rbp = 0x00007ffc1fb7ff50
    rsp = 0x00007ffc1fb7fe90   r12 = 0x000057ddd1a7d7c0
    r13 = 0x00007ffc1fb7ff70   r14 = 0x0000000000000000
    r15 = 0x0000000000000000   rip = 0x000079713c844816
    Found by: call frame info
22  wpa_supplicant!_start + 0x29
    rbx = 0x0000000000000000   rbp = 0x0000000000000000
    rsp = 0x00007ffc1fb7ff60   r12 = 0x000057ddd19d8b80
    r13 = 0x00007ffc1fb7ff70   r14 = 0x0000000000000000
    r15 = 0x0000000000000000   rip = 0x000057ddd19d8ba9
    Found by: call frame info
23  0x7ffc1fb7ff68
    rbx = 0x0000000000000000   rbp = 0x0000000000000000
    rsp = 0x00007ffc1fb7ff68   r12 = 0x000057ddd19d8b80
    r13 = 0x00007ffc1fb7ff70   r14 = 0x0000000000000000
    r15 = 0x0000000000000000   rip = 0x00007ffc1fb7ff68
    Found by: call frame info
24  0x7ffc1fbcd000
    rsp = 0x00007ffc1fb80010   rip = 0x00007ffc1fbcd000
    Found by: stack scanning
25  wpa_supplicant!_init + 0x1c08
    rsp = 0x00007ffc1fb800a0   rip = 0x000057ddd19d8b80
    Found by: stack scanning

Loaded modules:
0x57ddd19c6000 - 0x57ddd1aacfff  wpa_supplicant  ???  (main)
0x79713bee6000 - 0x79713befbfff  libgcc_s.so.1  ???
0x79713c0fd000 - 0x79713c201fff  libm-2.23.so  ???
0x79713c403000 - 0x79713c419fff  libpthread-2.23.so  ???
0x79713c620000 - 0x79713c622fff  libdl-2.23.so  ???
0x79713c824000 - 0x79713c9c4fff  libc-2.23.so  ???
0x79713cbcf000 - 0x79713cdb3fff  libcrypto.so.1.0.0  ???
0x79713cddf000 - 0x79713cde5fff  librt-2.23.so  ???
0x79713cfe7000 - 0x79713d00afff  ld-2.23.so  ???
0x79713d0b0000 - 0x79713d0befff  pkcs11.so  ???
0x79713d0c6000 - 0x79713d0c9fff  libattr.so.1.1.0  ???
0x79713d0cc000 - 0x79713d0e1fff  libz.so.1.2.11  ???
0x79713d0e7000 - 0x79713d0ebfff  libcap.so.2.24  ???
0x79713d0ee000 - 0x79713d132fff  libdbus-1.so.3.14.8  ???  (WARNING: No symbols, libdbus-1.so.3.14.8, 5C540FE403A631453D6A83E57E1B64BB0)
0x79713d137000 - 0x79713d190fff  libssl.so.1.0.0  ???
0x79713d19d000 - 0x79713d1dcfff  libnl.so.1.1  ???
0x79713d1f1000 - 0x79713d203fff  libminijailpreload.so  ???
0x7ffc1fbcd000 - 0x7ffc1fbcefff  linux-gate.so  ???


 
And a shill crash follows the wpa_supplicant crash:

2017-07-25T00:36:00.951493-07:00 DEBUG shill[8045]: [VERBOSE2:chromeos_supplicant_process_proxy.cc(119)] /fi/w1/wpa_supplicant1 CreateInterface
2017-07-25T00:36:01.183301-07:00 ERR shill[8045]: [ERROR:object_proxy.cc(582)] Failed to call method: fi.w1.wpa_supplicant1.CreateInterface: object_path= /fi/w1/wpa_supplicant1: org.freedesktop.DBus.Error.NoReply: Message recipient disconnected from message bus without replying
2017-07-25T00:36:01.183417-07:00 ERR shill[8045]: [ERROR:chromeos_supplicant_process_proxy.cc(130)] Failed to create interface: org.freedesktop.DBus.Error.NoReply Message recipient disconnected from message bus without replying
2017-07-25T00:36:01.183466-07:00 DEBUG shill[8045]: [VERBOSE2:chromeos_supplicant_process_proxy.cc(159)] /fi/w1/wpa_supplicant1 GetInterface: wlan0
2017-07-25T00:36:01.184553-07:00 ERR shill[8045]: [ERROR:object_proxy.cc(582)] Failed to call method: fi.w1.wpa_supplicant1.GetInterface: object_path= /fi/w1/wpa_supplicant1: org.freedesktop.DBus.Error.ServiceUnknown: The name fi.w1.wpa_supplicant1 was not provided by any .service files
2017-07-25T00:36:01.184619-07:00 ERR shill[8045]: [ERROR:dbus_method_invoker.h(111)] CallMethodAndBlockWithTimeout(...): Domain=dbus, Code=org.freedesktop.DBus.Error.ServiceUnknown, Message=The name fi.w1.wpa_supplicant1 was not provided by any .service files
2017-07-25T00:36:01.185254-07:00 CRIT shill[8045]: [FATAL:chromeos_supplicant_process_proxy.cc(168)] Failed to get interface wlan0: org.freedesktop.DBus.Error.ServiceUnknown The name fi.w1.wpa_supplicant1 was not provided by any .service files#012/usr/lib64/libbase-core-395517.so(base::debug::StackTrace::StackTrace()+0x13) [0x7bb62c4a2ba3]#012
@benchan - which autotests are you running exactly?
Can you please try the test in a test lab and share the results?
Owner: benchan@chromium.org
Status: Assigned (was: Untriaged)
Hmm, network_WiFi_SimpleConnect.wifi_check1x_WPA works on a lab device with R62-9776.0.0:

cros flash root@chromeos9-row4-rack1-host1.cros xbuddy://remote/samus/R62-9776.0.0/test
test_that --board=samus --fast chromeos9-row4-rack1-host1.cros network_WiFi_SimpleConnect.wifi_check1x_WPA
...
----------------------------------------------------------------------------------------------------
network_WiFi_SimpleConnect.wifi_check1x_WPA                                             [  PASSED  ]
network_WiFi_SimpleConnect.wifi_check1x_WPA/network_WiFi_SimpleConnect.wifi_check1x_WPA [  PASSED  ]
----------------------------------------------------------------------------------------------------

Can you try again after a repo sync, and rebuilding everything?
What exact tests are you running?
The supplicant crash can be reproduced as follows. Note that even through a cellular test is used, it can run on a non-cellular DUT. The test simply queries shill if a cellular modem is found. The supplicant crash seems unrelated to the cellular test:

   cros flash $DUT xbuddy://remote/samus/R62-9778.0.0/test
   test_that --fast $DUT network_3GModemPresent

Attached is the supplicant log
wpa.log
28.3 KB View Download
FYI, the supplicant crash is no longer observed after switching back from libp11 to engine_pkcs11.

Comment 7 by kirtika@google.com, Jul 26 2017

Sorry for the delayed update. I ran on my Eve with 9777.0.0 and cannot reproduce the crash with network_3GModemPresent. Below is the test log, I do not see anything with "grep -i crash net.log" or "ls /var/spool/crash" or in dmesg. 


21:36:44 INFO | autoserv| Got lock of exit_code_file.
21:36:44 INFO | autoserv| Released lock of exit_code_file and closed it.
21:36:46 INFO | autoserv| Killing child processes.
21:36:46 INFO | autoserv| Client complete
21:36:47 INFO | autoserv| Finished processing control file
21:36:47 INFO | autoserv| Skipping crash dump/info collection as requested.
21:36:47 INFO | autoserv| get_network_stats: at-end RXbytes 12593956 TXbytes 2151335
21:36:48 INFO | autoserv| record_state_duration failed: job_or_task_id=None, hostname=172.22.30.198, status=Running
-------------------------------------------------------------------------------------------------
/tmp/test_that_results_4iIDv1/results-1-network_3GModemPresent                        [  FAILED  ]
/tmp/test_that_results_4iIDv1/results-1-network_3GModemPresent                          FAIL: Unhandled DBusException: org.freedesktop.DBus.Error.NoReply: Message recipient disconnected from message bus without replying
/tmp/test_that_results_4iIDv1/results-1-network_3GModemPresent/network_3GModemPresent [  FAILED  ]
/tmp/test_that_results_4iIDv1/results-1-network_3GModemPresent/network_3GModemPresent   FAIL: Unhandled DBusException: org.freedesktop.DBus.Error.NoReply: Message recipient disconnected from message bus without replying
/tmp/test_that_results_4iIDv1/results-1-network_3GModemPresent/network_3GModemPresent   07/25 19:36:41.234 ERROR|  test_environment:0133| Error during test initialization:
/tmp/test_that_results_4iIDv1/results-1-network_3GModemPresent/network_3GModemPresent   Traceback (most recent call last):
/tmp/test_that_results_4iIDv1/results-1-network_3GModemPresent/network_3GModemPresent     File "/usr/local/autotest/cros/cellular/test_environment.py", line 117, in __enter__
/tmp/test_that_results_4iIDv1/results-1-network_3GModemPresent/network_3GModemPresent       self._verify_sim()
/tmp/test_that_results_4iIDv1/results-1-network_3GModemPresent/network_3GModemPresent     File "/usr/local/autotest/cros/cellular/test_environment.py", line 244, in _verify_sim
/tmp/test_that_results_4iIDv1/results-1-network_3GModemPresent/network_3GModemPresent       modem_device = self._get_shill_cellular_device_object()
/tmp/test_that_results_4iIDv1/results-1-network_3GModemPresent/network_3GModemPresent     File "/usr/local/autotest/cros/cellular/test_environment.py", line 154, in _get_shill_cellular_device_object
/tmp/test_that_results_4iIDv1/results-1-network_3GModemPresent/network_3GModemPresent       raise error.TestError('Cannot find cellular device in shill. '
/tmp/test_that_results_4iIDv1/results-1-network_3GModemPresent/network_3GModemPresent   TestError: Cannot find cellular device in shill. Is the modem plugged in?
-------------------------------------------------------------------------------------------------
Total PASS: 0/2 (0%)

21:36:49 INFO | Finished running tests. Results can be found in /tmp/test_that_results_4iIDv1 or /tmp/test_that_latest

Comment 8 by kirtika@google.com, Jul 26 2017

#c7 is wrong and I stand corrected. The crash dumps are there, just not in /var/spool/crash but in /usr/local/autotest/debug hierarchy, working on symbolizing those. In the meanwhile, can we consider reverting the libp11 CL?

ben was going to check if there was a simpler command line or specific set of dbus calls that can repro this.
I spent a long time looking into this today, but I ran out of time before finding a root cause.

For now I recommend we just revert back to the previous libp11 (engine_p11) by reverting:
https://chromium-review.googlesource.com/#/c/569904/
https://chromium-review.googlesource.com/#/c/569905/
https://chromium-review.googlesource.com/#/c/570142/

I was able to reproduce the supplicant crash by enabling/disabling WiFi through shill a few times:

To do that programmatically:

  dbus-send --print-reply --system --dest=org.chromium.flimflam /device/wlan0 org.chromium.flimflam.Device.Disable
  dbus-send --print-reply --system --dest=org.chromium.flimflam /device/wlan0 org.chromium.flimflam.Device.Enable

Repeat the commands above until one of the them fails as follows:

  Error org.freedesktop.DBus.Error.NoReply: Message recipient disconnected from message bus without replying

Then look for the supplicant and shill crashes under /var/spool/crash

We can probably add a stress enable/disable test for shill.
Project Member

Comment 12 by bugdroid1@chromium.org, Jul 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/aosp/platform/system/connectivity/shill/+/2dd11031ebbae69834105d41d686dce8f91d7d8d

commit 2dd11031ebbae69834105d41d686dce8f91d7d8d
Author: Kirtika Ruchandani <kirtika@chromium.org>
Date: Thu Jul 27 05:41:47 2017

Revert "shill: update wpa_supplicant.conf for libp11 0.4.0+"

This reverts commit d4b1f672b11ae58c36fb4588d8853996b0333966.
This commit causes a crash in wpa-supplicant and shill, which can be
reliably reproduced by running network_3GModemPresent on any board.

CQ-DEPEND=CL:587370
BUG=chromium:748407
TEST=run network_3GModemPresent on eve, check for no supplicant crash

Change-Id: If9af0077cd58647b87c59e8e035c79bc608086b9
Reviewed-on: https://chromium-review.googlesource.com/587328
Commit-Ready: Kirtika Ruchandani <kirtika@chromium.org>
Tested-by: Kirtika Ruchandani <kirtika@chromium.org>
Reviewed-by: Ben Chan <benchan@chromium.org>

[modify] https://crrev.com/2dd11031ebbae69834105d41d686dce8f91d7d8d/shims/wpa_supplicant.conf.in

Project Member

Comment 13 by bugdroid1@chromium.org, Jul 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/dc916c3bdfe1d569b8cdb1ed4bc56da4700accf4

commit dc916c3bdfe1d569b8cdb1ed4bc56da4700accf4
Author: Kirtika Ruchandani <kirtika@chromium.org>
Date: Thu Jul 27 05:41:46 2017

Revert "net-wireless/wpa_supplicant: engine_pkcs11 -> libp11"

This reverts commit d42e9591c3271da8f7fef3edd357cf6272657777.
This commit causes a crash in wpa-supplicant and shill, which can be
reliably reproduced by running network_3GModemPresent on any board.

CQ-DEPEND=CL:587328
BUG=chromium:748407
TEST=run network_3GModemPresent on eve, check for no supplicant crash

Change-Id: I69d0de711b157f5b2d845b3ca0465044f2cb28a5
Reviewed-on: https://chromium-review.googlesource.com/587249
Commit-Ready: Kirtika Ruchandani <kirtika@chromium.org>
Tested-by: Kirtika Ruchandani <kirtika@chromium.org>
Reviewed-by: Ben Chan <benchan@chromium.org>

[modify] https://crrev.com/dc916c3bdfe1d569b8cdb1ed4bc56da4700accf4/net-wireless/wpa_supplicant/wpa_supplicant-9999.ebuild

Project Member

Comment 14 by bugdroid1@chromium.org, Jul 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/overlays/portage-stable/+/3b618dd9199b8cc75ca869d4246cfe2da7a50684

commit 3b618dd9199b8cc75ca869d4246cfe2da7a50684
Author: Kirtika Ruchandani <kirtika@chromium.org>
Date: Thu Jul 27 05:41:47 2017

Revert "dev-libs/libp11 & engine_pkcs11: Update to latest from gentoo"

This reverts commit 31736c26284898a69357bc9cfba7011a328154ff.

This commit causes a crash in wpa-supplicant and shill, which can be
reliably reproduced by running network_3GModemPresent on any board.

CQ-DEPEND=CL:587249
BUG=chromium:748407
TEST=run network_3GModemPresent on eve, check for no supplicant crash

Change-Id: I788d2965d96f8d7fddf6063da92776c865474ecf
Reviewed-on: https://chromium-review.googlesource.com/587370
Commit-Ready: Kirtika Ruchandani <kirtika@chromium.org>
Tested-by: Kirtika Ruchandani <kirtika@chromium.org>
Reviewed-by: Ben Chan <benchan@chromium.org>

[add] https://crrev.com/3b618dd9199b8cc75ca869d4246cfe2da7a50684/dev-libs/engine_pkcs11/Manifest
[modify] https://crrev.com/3b618dd9199b8cc75ca869d4246cfe2da7a50684/dev-libs/libp11/metadata.xml
[add] https://crrev.com/3b618dd9199b8cc75ca869d4246cfe2da7a50684/dev-libs/engine_pkcs11/engine_pkcs11-0.1.8-r1.ebuild
[modify] https://crrev.com/3b618dd9199b8cc75ca869d4246cfe2da7a50684/dev-libs/libp11/Manifest
[add] https://crrev.com/3b618dd9199b8cc75ca869d4246cfe2da7a50684/dev-libs/libp11/files/libp11-0.2.8-no-ltdl.patch
[delete] https://crrev.com/8355664c39ee36822c44fbd32946a0a67a1d6553/dev-libs/libp11/libp11-0.4.4.ebuild
[add] https://crrev.com/3b618dd9199b8cc75ca869d4246cfe2da7a50684/dev-libs/libp11/files/libp11-0.2.8-variable-buffer-size.patch
[add] https://crrev.com/3b618dd9199b8cc75ca869d4246cfe2da7a50684/dev-libs/engine_pkcs11/metadata.xml
[add] https://crrev.com/3b618dd9199b8cc75ca869d4246cfe2da7a50684/dev-libs/libp11/libp11-0.2.8-r3.ebuild
[add] https://crrev.com/3b618dd9199b8cc75ca869d4246cfe2da7a50684/dev-libs/engine_pkcs11/engine_pkcs11-0.1.8.ebuild

Blocking: 739688
Cc: mnissler@chromium.org jorgelo@chromium.org benchan@chromium.org
Owner: djkurtz@chromium.org
Status: Started (was: Assigned)
I just wrote up a painful long detailed explanation of what is happening here, but it got erased before I hit send, so I'll try to summarize again...

wpa_supplicant uses the "dynamic" engine to load pkcs11.so.  The dynamic engine uses a static variable to store an "ex_data" id for storing a context pointer.  The dynamic engine registers a "free" function with this ID to unload any engines that it has dynamically loaded.  The dynamically loaded pkcs11 engine also uses a static variable to store its "ex_data" id for storing its own context pointer.  The IDs are from the same CRYPTO_EX_INDEX_ENGINE namespace, thus for wpa_supplicant's first TLS session, the dynamic ID = 0, pkcs11 ID = 1.

Now, when the TLS session ends, tls_deinit() calls ENGINE_cleanup(), which calls the dynamic engine's "ex_data ID free routine" which in turn dlcloses() pkcs11.so.  However, OpenSSL and the dynamic engine are themselves not unloaded, so the dynamic engine still uses the same "ex_data ID".  After ENGINE_cleanup(), tls_deinit() then calls CRYPTO_cleanup_all_ex_data(), which resets all of the "ex_data" classes, invalidating the dynamic engines's ID.

On the next TLS session, the dynamic engine's static id variable still has the last (stale) ID (0).  However, it no longer has a registered free() function.  Furthermore, when pkcs11 is dynamically loaded, its static variable has been reset, so this time, it fetches a new ID - but since dynamic didn't go first, the ID returned is (0) - the same as is being (re-)used by dynamic.  Essentially, the dynamic and pkcs11 engines are both using the exact same memory for their contexts.  

When the second TLS session ends, since the dynamic engine failed to register a free() function, the pkcs11.so module is NOT dlclosed(), and hence its static variable is not reset.

When the third TLS session, starts both dynamic and pkcs11 engines again use the same ID (0), and in fact use the same memory for their contexts.  When the third TLS session ends, both engines end up free'ing the same pointer: dynamic thinks it is free'ing DYNAMIC_LIBNAME, and pkcs11.so thinks its free'ing module.  This leads to a double free / crash.


AFAICT, the right way to fix this is for the dynamic engine to reset dynamic_ex_data_idx in ENGINE_cleanup() (before CRYPTO_cleanup_all_ex_data), perhaps in a new "destroy" callback.
To follow up here, the reason this is not a problem with upstream wpa_supplicant when using openssl 1.1 & upstream libp11, is this wpa_supplicant patch:

29bc76e3d3252dfc311d1ffaa15be66abd8509ee
OpenSSL: Do not use library init/deinit functions with 1.1.0

The above patch comments out the call to CRYPTO_cleanup_all_ex_data() in tls_deinit() when using OPENSSL_VERSION_NUMBER >= 0x10100000L.

So, we could just defer and update pkcs11 and wpa_supplicant and openssl-1.1 atomically.
Staying closer to upstream SGTM.

Comment 19 Deleted

Comment 20 Deleted

Sign in to add a comment