Consider turning CBMEM console into a persistent ring buffer |
|||||
Issue descriptionIt seems to me that we keep running into situations where it would be really useful to grab the firmware log from one or two boots ago... such as when a dogfood system has gone into recovery mode but we want to know exactly why (and the recovery reason isn't enough). Also, I have recently been thinking about extending CBMEM console into ARM TF to log runtime issues, and logging a panic in there wouldn't be very useful if it gets immediately overwritten on reboot. Wouldn't it be much more useful if we considered the console a persistent ring buffer where we just keep appending? Even though CBMEM is laid out dynamically, in practice the area will almost always fall onto the same location. We could just do a quick sanity check on initialization to see if the existing header is valid (length within bounds and stuff, maybe a new magic number), and if it is append the current boot to the end of it. (For extra credit we could do the same thing with the pre-CBMEM SRAM buffer on non-x86 devices to also catch boot attempts that rebooted before CBMEM init.) We'd maybe need to talk to privacy about how much of that we can still include in the feedback report, but otherwise I don't see any problems this could cause. Anyone else?
,
Oct 20 2016
Assigning to Stefan for triage. I can take this on myself the next time I have time for something (which might be a while :/ ), but I'd be happy to advise/review someone else doing it, too (I don't think it would really be all that complicated). @Aaron: You already see where the device rebooted because it always starts at "\n\ncoreboot-XXX <stage> starting...", right? So you can just count the number of reboots backwards in both firmware log and syslog to correlate stuff. It might be nice to also print RTC timestamps in the firmware log in addition to that (e.g. we could just have elog_add_event_raw() print out the timestamp that it is fetching anyway).
,
Oct 20 2016
dates/times are super helpful in there for sure. on x86 resume we also go through the same paths. They only slightly differ towards the end of ramstage. My personal preference is to stitch dmesg and firmware logs together, but we've never managed to do that in Chrome OS.
,
Oct 20 2016
Right... for x86 I hope there's enough in the log to tell the difference between boot and resumes (maybe we can make it more obvious/consistent somehow). I think the stitching you describe could be attempted as a follow-up to this but is mostly independent. The persistent ring buffer would still be very useful to catch multiple reboots that didn't go all the way through to the kernel. And then you could maybe add an upstart task or modify rsyslogd to extract the log, inject it into the right place in syslog and clear the CBMEM buffer on every boot. Another option might be to delegate this to log analysis tools. The Android Bug Tool that now supports Chrome OS feedback reports already has this feature where it can combine multiple logs into a single timestamp-ordered view. Maybe we could teach it how to interleave the bios_log part of the report with the others.
,
Feb 6 2017
I took a stab at this now... patches are up ending in https://review.coreboot.org/#/c/18301/
,
Apr 21 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/coreboot/+/3465684df6240c0b023d7e5e75101257b0fa801e commit 3465684df6240c0b023d7e5e75101257b0fa801e Author: Julius Werner <jwerner@chromium.org> Date: Fri Apr 21 13:03:52 2017 UPSTREAM: Turn CBMEM console into a ring buffer that can persist across reboots This patch allows the CBMEM console to persist across reboots, which should greatly help post factum debugging of issues involving multiple reboots. In order to prevent the console from filling up, it will instead operate as a ring buffer that continues to evict the oldest lines once full. (This means that if even a single boot doesn't fit into the buffer, we will now drop the oldest lines whereas previous code would've dropped the newest lines instead.) The console control structure is modified in a sorta backwards-compatible way, so that new readers can continue to work with old console buffers and vice versa. When an old reader reads a new buffer that has already once overflowed (i.e. is operating in true ring buffer mode) it will print lines out of order, but it will at least still print out the whole console content and not do any illegal memory accesses (assuming it correctly implemented cursor overflow as it was already possible before this patch). BUG= chromium:651966 TEST=Rebooted and confirmed output repeatedly on a Kevin and a Falco. Also confirmed correct behavior across suspend/resume for the latter. Change-Id: I927df6bbffa13fdce9d1984278afa340042f7ac9 Signed-off-by: Patrick Georgi <pgeorgi@chromium.org> Original-Commit-Id: d67c6876b521d784e38ea91b5a1828a24c9cb1c8 Original-Change-Id: Ifcbf59d58e1ad20995b98d111c4647281fbb45ff Original-Signed-off-by: Julius Werner <jwerner@chromium.org> Original-Reviewed-on: https://review.coreboot.org/18301 Original-Tested-by: build bot (Jenkins) Original-Reviewed-by: Aaron Durbin <adurbin@chromium.org> Reviewed-on: https://chromium-review.googlesource.com/482975 [modify] https://crrev.com/3465684df6240c0b023d7e5e75101257b0fa801e/src/lib/cbmem_console.c [modify] https://crrev.com/3465684df6240c0b023d7e5e75101257b0fa801e/payloads/coreinfo/bootlog_module.c [modify] https://crrev.com/3465684df6240c0b023d7e5e75101257b0fa801e/payloads/libpayload/drivers/cbmem_console.c [modify] https://crrev.com/3465684df6240c0b023d7e5e75101257b0fa801e/util/cbmem/cbmem.c
,
Apr 21 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/coreboot/+/3465684df6240c0b023d7e5e75101257b0fa801e commit 3465684df6240c0b023d7e5e75101257b0fa801e Author: Julius Werner <jwerner@chromium.org> Date: Fri Apr 21 13:03:52 2017 UPSTREAM: Turn CBMEM console into a ring buffer that can persist across reboots This patch allows the CBMEM console to persist across reboots, which should greatly help post factum debugging of issues involving multiple reboots. In order to prevent the console from filling up, it will instead operate as a ring buffer that continues to evict the oldest lines once full. (This means that if even a single boot doesn't fit into the buffer, we will now drop the oldest lines whereas previous code would've dropped the newest lines instead.) The console control structure is modified in a sorta backwards-compatible way, so that new readers can continue to work with old console buffers and vice versa. When an old reader reads a new buffer that has already once overflowed (i.e. is operating in true ring buffer mode) it will print lines out of order, but it will at least still print out the whole console content and not do any illegal memory accesses (assuming it correctly implemented cursor overflow as it was already possible before this patch). BUG= chromium:651966 TEST=Rebooted and confirmed output repeatedly on a Kevin and a Falco. Also confirmed correct behavior across suspend/resume for the latter. Change-Id: I927df6bbffa13fdce9d1984278afa340042f7ac9 Signed-off-by: Patrick Georgi <pgeorgi@chromium.org> Original-Commit-Id: d67c6876b521d784e38ea91b5a1828a24c9cb1c8 Original-Change-Id: Ifcbf59d58e1ad20995b98d111c4647281fbb45ff Original-Signed-off-by: Julius Werner <jwerner@chromium.org> Original-Reviewed-on: https://review.coreboot.org/18301 Original-Tested-by: build bot (Jenkins) Original-Reviewed-by: Aaron Durbin <adurbin@chromium.org> Reviewed-on: https://chromium-review.googlesource.com/482975 [modify] https://crrev.com/3465684df6240c0b023d7e5e75101257b0fa801e/src/lib/cbmem_console.c [modify] https://crrev.com/3465684df6240c0b023d7e5e75101257b0fa801e/payloads/coreinfo/bootlog_module.c [modify] https://crrev.com/3465684df6240c0b023d7e5e75101257b0fa801e/payloads/libpayload/drivers/cbmem_console.c [modify] https://crrev.com/3465684df6240c0b023d7e5e75101257b0fa801e/util/cbmem/cbmem.c
,
Jun 20 2017
|
|||||
►
Sign in to add a comment |
|||||
Comment 1 by adurbin@chromium.org
, Oct 12 2016