New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 752445 link

Starred by 1 user

Issue metadata

Status: Assigned
Owner:
Last visit > 30 days ago
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



Sign in to add a comment

Suspicious stack overflow in eve EC firmware

Project Member Reported by cheyuw@google.com, Aug 4 2017

Issue description

The following are suspicious stack overflows in the tasks of eve.
They are reported by an experimental stack analysis tool.

For more details about these call traces, here are the steps to run the tool:
1. Apply CL https://chromium-review.googlesource.com/c/601574/1
2. make BOARD=eve -j && make BOARD=eve SECTION=RW analyzestack


All "i2c_xfer -> cprints" are actually "i2c_xfer -> chip_i2c_xfer -> cprints", but optimized by inlining.
The trace of PD_C0 and PD_C1 is removed invalid state transition path by hands.
The "uint64divmod" is possible to be called since "cprints" contains timestamp format string "%T".

Task: CHG_RAMP, Max size: 732 (508 + 224) > 512
chg_ramp_task (40) -> charge_manager_save_log.part.2.lto_priv.312 (56) -> charge_manager_fill_power_info (48) -> charger_get_vbus_voltage (16) -> ch_raw_read16.lto_priv.177 (24) -> i2c_read16 (48) -> i2c_xfer (56) -> cprints (40) -> cprintf (32) -> vfnprintf (112) -> uint64divmod.part.3.lto_priv.148 (36)

Task: USB_CHG, Max size: 652 (428 + 224) > 512
usb_charger_task (56) -> bd9995x_bc12_enable_charging (24) -> ch_raw_read16.lto_priv.177 (24) -> i2c_read16 (48) -> i2c_xfer (56) -> cprints (40) -> cprintf (32) -> vfnprintf (112) -> uint64divmod.part.3.lto_priv.148 (36)

Task: CHARGER, Max size: 756 (532 + 224) > 640
charger_task (56) -> chipset_force_shutdown (8) -> set_pwrbtn_to_pch.lto_priv.147 (16) -> charge_prevent_power_on (56) -> battery_get_params (56) -> battery_is_present (24) -> sb_read (0) -> i2c_read16 (48) -> i2c_xfer (56) -> cprints (40) -> cprintf (32) -> vfnprintf (112) -> uint64divmod.part.3.lto_priv.148 (36)

Task: CHIPSET, Max size: 732 (508 + 224) > 640
chipset_task (24) -> power_button_pch_release (8) -> set_pwrbtn_to_pch.lto_priv.147 (16) -> charge_prevent_power_on (56) -> battery_get_params (56) -> battery_is_present (24) -> sb_read (0) -> i2c_read16 (48) -> i2c_xfer (56) -> cprints (40) -> cprintf (32) -> vfnprintf (112) -> uint64divmod.part.3.lto_priv.148 (36)

Task: KEYPROTO, Max size: 516 (292 + 224) > 512
keyboard_protocol_task (40) -> update_ctl_ram.lto_priv.359 (16) -> keyboard_enable.lto_priv.360 (16) -> cprints (40) -> cprintf (32) -> vfnprintf (112) -> uint64divmod.part.3.lto_priv.148 (36)

Task: POWERBTN, Max size: 740 (516 + 224) > 640
power_button_task (40) -> set_pwrbtn_to_pch.lto_priv.147 (16) -> charge_prevent_power_on (56) -> battery_get_params (56) -> battery_is_present (24) -> sb_read (0) -> i2c_read16 (48) -> i2c_xfer (56) -> cprints (40) -> cprintf (32) -> vfnprintf (112) -> uint64divmod.part.3.lto_priv.148 (36)

Task: PD_C0 and PD_C1, Max size: 940 (716 + 224) > 640
pd_task (160) -> set_state.lto_priv.248 (40) -> pd_power_supply_reset (16) -> charge_manager_source_port (32) -> charge_manager_save_log.part.2.lto_priv.312 (56) -> charge_manager_fill_power_info (48) -> charger_get_vbus_voltage (16) -> ch_raw_read16.lto_priv.177 (24) -> i2c_read16 (48) -> i2c_xfer (56) -> cprints (40) -> cprintf (32) -> vfnprintf (112) -> uint64divmod.part.3.lto_priv.148 (36)


Expected result:
Make the stack size of tasks large enough.
 

Comment 1 by cheyuw@google.com, Aug 4 2017

Components: OS>Firmware>EC

Comment 2 by cheyuw@google.com, Aug 4 2017

Cc: wnhuang@chromium.org drinkcat@chromium.org
Summary: Suspicious stack overflow in eve EC firmware (was: Suspicious stack overflow on eve EC firmware)

Comment 3 by cheyuw@google.com, Aug 4 2017

Owner: cheyuw@google.com
Cc: vpalatin@chromium.org scollyer@chromium.org dlaurie@chromium.org
Labels: -Pri-3 OS-Chrome Pri-1
We should remove the cprints in chip/npcx/i2c.c as it's seldom a good idea to flood the console with traces in case of I2C issue (as we are trying a lot of i2c transfers), and do a lighterweight version of the i2c_unwedge code as it is costing too much right now.
We would get it of most of this, the remaining one is likely the PD tasks, and the unwedge code is still the worst offender (maybe we should look at doing it in a hook)

--- a/chip/npcx/i2c.c
+++ b/chip/npcx/i2c.c
@@ -258,7 +258,7 @@ enum smb_error i2c_master_transaction(int controller)
                        /* Need to read the other bytes from next transaction */
                        p_status->oper_state = SMB_READ_OPER;
        } else
-               cprints(CC_I2C, "Unexpected i2c state machine! %d",
+               CPRINTS(CC_I2C, "Unexpected i2c state machine! %d",
                                p_status->oper_state);
 
        /* Generate a START condition */
@@ -298,7 +298,7 @@ enum smb_error i2c_master_transaction(int controller)
        /* Wait till STOP condition is generated for normal transaction */
        if (p_status->err_code == SMB_OK && i2c_wait_stop_completed(controller,
                        I2C_MIN_TIMEOUT) != EC_SUCCESS) {
-               cprints(CC_I2C, "STOP fail! scl %02x is held by slave device!",
+               CPRINTS(CC_I2C, "STOP fail! scl %02x is held by slave device!",
                                controller);
                p_status->err_code = SMB_TIMEOUT_ERROR;
        }
--- a/common/i2c_master.c
+++ b/common/i2c_master.c
@@ -398,7 +398,7 @@ int i2c_unwedge(int port)
                                 * If we get here, a slave is holding the clock
                                 * low and there is nothing we can do.
                                 */
-                               CPRINTS("I2C unwedge failed, SCL is being held low");
+                               CPUTS("I2C unwedge failed, SCL is being held low");
                                ret = EC_ERROR_UNKNOWN;
                                goto unwedge_done;
                        }
@@ -411,7 +411,7 @@ int i2c_unwedge(int port)
        if (i2c_raw_get_sda(port))
                goto unwedge_done;
 
-       CPRINTS("I2C unwedge called with SDA held low");
+       CPUTS("I2C unwedge called with SDA held low");
 
        /* Keep trying to unwedge the SDA line until we run out of attempts. */
        for (i = 0; i < UNWEDGE_SDA_ATTEMPTS; i++) {
@@ -446,11 +446,11 @@ int i2c_unwedge(int port)
        }
 
        if (!i2c_raw_get_sda(port)) {
-               CPRINTS("I2C unwedge failed, SDA still low");
+               CPUTS("I2C unwedge failed, SDA still low");
                ret = EC_ERROR_UNKNOWN;
        }
        if (!i2c_raw_get_scl(port)) {
-               CPRINTS("I2C unwedge failed, SCL still low");
+               CPUTS("I2C unwedge failed, SCL still low");
                ret = EC_ERROR_UNKNOWN;
        }
 

Cc: furquan@chromium.org
Owner: scollyer@chromium.org
(we should now be able to reproduce this issue by following instructions in https://sites.google.com/a/chromium.org/dev/chromium-os/ec-development/stack-size-analyzer )

Comment 7 Deleted

Comment 8 Deleted

Status: Assigned (was: Untriaged)

Sign in to add a comment