Suspicious stack overflow in eve EC firmware |
||||||
Issue descriptionThe following are suspicious stack overflows in the tasks of eve. They are reported by an experimental stack analysis tool. For more details about these call traces, here are the steps to run the tool: 1. Apply CL https://chromium-review.googlesource.com/c/601574/1 2. make BOARD=eve -j && make BOARD=eve SECTION=RW analyzestack All "i2c_xfer -> cprints" are actually "i2c_xfer -> chip_i2c_xfer -> cprints", but optimized by inlining. The trace of PD_C0 and PD_C1 is removed invalid state transition path by hands. The "uint64divmod" is possible to be called since "cprints" contains timestamp format string "%T". Task: CHG_RAMP, Max size: 732 (508 + 224) > 512 chg_ramp_task (40) -> charge_manager_save_log.part.2.lto_priv.312 (56) -> charge_manager_fill_power_info (48) -> charger_get_vbus_voltage (16) -> ch_raw_read16.lto_priv.177 (24) -> i2c_read16 (48) -> i2c_xfer (56) -> cprints (40) -> cprintf (32) -> vfnprintf (112) -> uint64divmod.part.3.lto_priv.148 (36) Task: USB_CHG, Max size: 652 (428 + 224) > 512 usb_charger_task (56) -> bd9995x_bc12_enable_charging (24) -> ch_raw_read16.lto_priv.177 (24) -> i2c_read16 (48) -> i2c_xfer (56) -> cprints (40) -> cprintf (32) -> vfnprintf (112) -> uint64divmod.part.3.lto_priv.148 (36) Task: CHARGER, Max size: 756 (532 + 224) > 640 charger_task (56) -> chipset_force_shutdown (8) -> set_pwrbtn_to_pch.lto_priv.147 (16) -> charge_prevent_power_on (56) -> battery_get_params (56) -> battery_is_present (24) -> sb_read (0) -> i2c_read16 (48) -> i2c_xfer (56) -> cprints (40) -> cprintf (32) -> vfnprintf (112) -> uint64divmod.part.3.lto_priv.148 (36) Task: CHIPSET, Max size: 732 (508 + 224) > 640 chipset_task (24) -> power_button_pch_release (8) -> set_pwrbtn_to_pch.lto_priv.147 (16) -> charge_prevent_power_on (56) -> battery_get_params (56) -> battery_is_present (24) -> sb_read (0) -> i2c_read16 (48) -> i2c_xfer (56) -> cprints (40) -> cprintf (32) -> vfnprintf (112) -> uint64divmod.part.3.lto_priv.148 (36) Task: KEYPROTO, Max size: 516 (292 + 224) > 512 keyboard_protocol_task (40) -> update_ctl_ram.lto_priv.359 (16) -> keyboard_enable.lto_priv.360 (16) -> cprints (40) -> cprintf (32) -> vfnprintf (112) -> uint64divmod.part.3.lto_priv.148 (36) Task: POWERBTN, Max size: 740 (516 + 224) > 640 power_button_task (40) -> set_pwrbtn_to_pch.lto_priv.147 (16) -> charge_prevent_power_on (56) -> battery_get_params (56) -> battery_is_present (24) -> sb_read (0) -> i2c_read16 (48) -> i2c_xfer (56) -> cprints (40) -> cprintf (32) -> vfnprintf (112) -> uint64divmod.part.3.lto_priv.148 (36) Task: PD_C0 and PD_C1, Max size: 940 (716 + 224) > 640 pd_task (160) -> set_state.lto_priv.248 (40) -> pd_power_supply_reset (16) -> charge_manager_source_port (32) -> charge_manager_save_log.part.2.lto_priv.312 (56) -> charge_manager_fill_power_info (48) -> charger_get_vbus_voltage (16) -> ch_raw_read16.lto_priv.177 (24) -> i2c_read16 (48) -> i2c_xfer (56) -> cprints (40) -> cprintf (32) -> vfnprintf (112) -> uint64divmod.part.3.lto_priv.148 (36) Expected result: Make the stack size of tasks large enough.
,
Aug 4 2017
,
Aug 4 2017
,
Aug 4 2017
,
Aug 4 2017
We should remove the cprints in chip/npcx/i2c.c as it's seldom a good idea to flood the console with traces in case of I2C issue (as we are trying a lot of i2c transfers), and do a lighterweight version of the i2c_unwedge code as it is costing too much right now.
We would get it of most of this, the remaining one is likely the PD tasks, and the unwedge code is still the worst offender (maybe we should look at doing it in a hook)
--- a/chip/npcx/i2c.c
+++ b/chip/npcx/i2c.c
@@ -258,7 +258,7 @@ enum smb_error i2c_master_transaction(int controller)
/* Need to read the other bytes from next transaction */
p_status->oper_state = SMB_READ_OPER;
} else
- cprints(CC_I2C, "Unexpected i2c state machine! %d",
+ CPRINTS(CC_I2C, "Unexpected i2c state machine! %d",
p_status->oper_state);
/* Generate a START condition */
@@ -298,7 +298,7 @@ enum smb_error i2c_master_transaction(int controller)
/* Wait till STOP condition is generated for normal transaction */
if (p_status->err_code == SMB_OK && i2c_wait_stop_completed(controller,
I2C_MIN_TIMEOUT) != EC_SUCCESS) {
- cprints(CC_I2C, "STOP fail! scl %02x is held by slave device!",
+ CPRINTS(CC_I2C, "STOP fail! scl %02x is held by slave device!",
controller);
p_status->err_code = SMB_TIMEOUT_ERROR;
}
--- a/common/i2c_master.c
+++ b/common/i2c_master.c
@@ -398,7 +398,7 @@ int i2c_unwedge(int port)
* If we get here, a slave is holding the clock
* low and there is nothing we can do.
*/
- CPRINTS("I2C unwedge failed, SCL is being held low");
+ CPUTS("I2C unwedge failed, SCL is being held low");
ret = EC_ERROR_UNKNOWN;
goto unwedge_done;
}
@@ -411,7 +411,7 @@ int i2c_unwedge(int port)
if (i2c_raw_get_sda(port))
goto unwedge_done;
- CPRINTS("I2C unwedge called with SDA held low");
+ CPUTS("I2C unwedge called with SDA held low");
/* Keep trying to unwedge the SDA line until we run out of attempts. */
for (i = 0; i < UNWEDGE_SDA_ATTEMPTS; i++) {
@@ -446,11 +446,11 @@ int i2c_unwedge(int port)
}
if (!i2c_raw_get_sda(port)) {
- CPRINTS("I2C unwedge failed, SDA still low");
+ CPUTS("I2C unwedge failed, SDA still low");
ret = EC_ERROR_UNKNOWN;
}
if (!i2c_raw_get_scl(port)) {
- CPRINTS("I2C unwedge failed, SCL still low");
+ CPUTS("I2C unwedge failed, SCL still low");
ret = EC_ERROR_UNKNOWN;
}
,
Sep 13 2017
(we should now be able to reproduce this issue by following instructions in https://sites.google.com/a/chromium.org/dev/chromium-os/ec-development/stack-size-analyzer )
,
Aug 1
|
||||||
►
Sign in to add a comment |
||||||
Comment 1 by cheyuw@google.com
, Aug 4 2017