autotest should log AP and EC uart
Issue description

Autotest currently performs most actions over the network, and many issues simply have the symptom "didn't connect to network". However, a large variety of things might prevent network access, ranging from the machine's EC being bricked to incorrect autotest config. Logging EC and AP uart would provide extremely valuable data for distinguishing these failure cases. "dut-control ec_uart_pty cpu_uart_pty" will show the pty devices that output this data.
,
Mar 8 2018
,
Mar 9 2018
Is this a request to stream or collect these logs continuously during an autotest run? Or to have them continuously streamed or collected to a file on disk? Or is it sufficient to run the command you suggest once after the test, during the "results collection" phase? Note that I'm not sure how this will help with "didn't connect to network" issues, as we will still only be able to ssh into the DUT and run this command if the device is on the network. -> nsanders to clarify
,
Mar 9 2018
These are USB -> serial interfaces that log the DUT's serial output on the labstation, so this captures device state whether the network is up or down, or even when the device won't boot. The command above simply gives you the Unix pty device associated with the serial port stream, like "/dev/pts/48". Autotest can save or stream this to a file in its own way.
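For illustration only, here is a minimal Python sketch of that idea: look up the pty paths, then copy whatever a pty emits into a log file. It assumes dut-control prints one "name:value" pair per line; the helper names parse_dut_control and drain_to_log are hypothetical and are not part of autotest or servod.

```python
import os


def parse_dut_control(output):
    """Map control names to values, assuming one 'name:value' pair per line."""
    ptys = {}
    for line in output.splitlines():
        name, sep, value = line.strip().partition(':')
        if sep and value:
            ptys[name] = value
    return ptys


def drain_to_log(src_fd, log_path, chunk_size=4096):
    """Append everything readable from src_fd to log_path until EOF.

    Returns the number of bytes copied.
    """
    total = 0
    with open(log_path, 'ab') as log:
        while True:
            try:
                chunk = os.read(src_fd, chunk_size)
            except OSError:
                break  # e.g. a pty reports an I/O error once its peer goes away
            if not chunk:
                break  # EOF
            log.write(chunk)
            total += len(chunk)
    return total
```

In a real setup the file descriptor would come from opening a path such as the "/dev/pts/48" example above on the machine running servod; how that is wired into autotest is left open here.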
,
Mar 9 2018
So the request is to stream the output of running a command continuously to a log file on labstations?
,
Mar 9 2018
Yep, that's exactly it.
,
Mar 10 2018
I have no idea how labstations work, but I don't think this is trivial. kevcheng, are you familiar with the Autotest to labstation interface?
,
Mar 10 2018
Some sample code is here: https://cs.corp.google.com/chromeos_public/src/third_party/autotest/files/server/cros/faft/firmware_test.py?l=679 I don't think kevcheng is on chrome os anymore.
,
Mar 10 2018
Ah, okay, so the labstation/servo interface already supports it. It should just be a "small" matter of adding that code to autoserv somewhere. nsanders, do you know if the AP uart stream is the same (specifically, is it implemented)?
,
Mar 10 2018
AP uart is just another name for CPU uart, so it's implemented in the above code.
,
Mar 12 2018
Sounds potentially valuable. Worth a 1-or-2-pager design doc. nsanders@ Can you (or somebody on your team) collaborate with someone on infra to put together the doc / requirements? If so, I can shop the project around to find the infra partner.
,
Mar 12 2018
Sure, who on infra should I talk to?
,
Mar 12 2018
I need to figure that out / shop it around, which would be easier to do if the requirements doc was started.
,
Mar 12 2018
From my side the requirement would be: "logfiles containing the character output of servod's cpu_uart_pty and ec_uart_pty should be included with test runs." I'm not sure if this needs a design doc, since it's probably only a few lines of code copy-pasted from FAFT to the right location in autotest.
,
Mar 13 2018
There are several open questions (in my mind) which is why I suggest this needs a design doc:
1) Should this be enabled for all tests, or only some?
2) What to do for DUTs that do not have an associated labstation?
3) Does the code you linked to above require setup or cleanup work prior or after each test, or is it just a single call at the end of a test?
4) How much overhead will this add?
5) What to do in case of failure of the collection? How long to wait for it to succeed?
6) How big are these collected logs expected to be?
,
Mar 13 2018
I don't really have much insight into these questions, but I'd be happy to meet with someone on your team to provide context from the device side.
,
Mar 13 2018
There are several open questions (in my mind) which is why I suggest this needs a design doc:

1) Should this be enabled for all tests, or only some?
Likely all, but special tasks would be a reasonable starting point.

2) What to do for DUTs that do not have an associated labstation?
Well, some devices have dedicated servos; those should work in the same way. Without any servo support, no logs.

3) Does the code you linked to above require setup or cleanup work prior or after each test, or is it just a single call at the end of a test?
Likely a collection task will need to be started to spool the logs somewhere at the beginning of the test, and at the end of the test, those logs will need to be retrieved and uploaded to Google Storage.

4) How much overhead will this add?
To the DUT, none. Not sure about the gathering device; likely not too much, but I could see some badly behaving devices producing log spam and therefore both CPU and disk burn.

5) What to do in case of failure of the collection? How long to wait for it to succeed?
The failure would be a failure to start the gathering process (think of it like doing a tail -f or nc piped to a file), at which point, I think there would just be a warning and move on.

6) How big are these collected logs expected to be?
I'd say generally small to non-existent, but in some problematic cases they could be large.
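The answers to 3), 5) and 6) can be sketched roughly as below. This is not the autotest implementation; UartCollector and its max_bytes cap are hypothetical names chosen for illustration, and the upload to Google Storage is left out. The collector starts spooling at test start, only warns if it cannot start, and stops growing the log once a size limit is hit.

```python
import logging
import os
import select
import threading


class UartCollector:
    """Spool bytes from a pty (or any readable path) into a log file."""

    def __init__(self, source_path, log_path, max_bytes=10 * 1024 * 1024):
        self.source_path = source_path
        self.log_path = log_path
        self.max_bytes = max_bytes  # hypothetical cap against log spam
        self.written = 0
        self._stop = threading.Event()
        self._thread = None

    def start(self):
        """Begin collecting; on failure, warn and return False."""
        try:
            fd = os.open(self.source_path, os.O_RDONLY | os.O_NONBLOCK)
        except OSError as e:
            logging.warning('UART capture not started for %s: %s',
                            self.source_path, e)
            return False
        self._thread = threading.Thread(target=self._run, args=(fd,))
        self._thread.daemon = True
        self._thread.start()
        return True

    def _run(self, fd):
        try:
            with open(self.log_path, 'ab') as log:
                while not self._stop.is_set():
                    ready, _, _ = select.select([fd], [], [], 0.1)
                    if not ready:
                        continue
                    chunk = os.read(fd, 4096)
                    if not chunk:
                        break  # EOF
                    # Keep draining past the cap, but stop growing the file.
                    keep = chunk[:max(self.max_bytes - self.written, 0)]
                    log.write(keep)
                    self.written += len(keep)
        except OSError as e:
            logging.warning('UART capture ended early: %s', e)
        finally:
            os.close(fd)

    def stop(self):
        """Stop the background reader; the log file is complete afterwards."""
        if self._thread is not None:
            self._stop.set()
            self._thread.join()
            self._thread = None
```

A test harness would call start() before the test and stop() afterwards, then pick up the log file during results collection. Draining past the cap rather than closing the pty is a design choice made here so a noisy console does not back up; discarding the collector entirely would be an equally reasonable option.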
,
Mar 16 2018
nxia@ will investigate a little bit
,
Apr 3 2018
How often is the AP firmware console enabled during the test? It is not enabled in prod firmware. Also, having the serial interface on changes boot timing drastically, which could trigger boot duration test failures. Ideally, the AP firmware log should be collected after a failure to boot Chrome OS is encountered. At this point the DUT could be booted in recovery mode, and the last couple of AP firmware console cycles would be available in /sys/firmware/log. Often the failure to communicate happens because the Ethernet interface does not come up, which is a different story.
,
Apr 3 2018
To be clear, this bug is only requesting that autotest log existing uart output, not any change to firmware behavior. /sys/firmware/log might already be logged, but if not, it would be good to add it.
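If /sys/firmware/log turns out not to be collected, a copy step could look something like the sketch below. This is only an assumption-laden illustration: save_firmware_log and the firmware.log destination name are hypothetical, and the code runs locally, whereas autotest would have to fetch the file from the DUT.

```python
import os


def save_firmware_log(results_dir, source='/sys/firmware/log'):
    """Copy the firmware log into results_dir; return the new path or None."""
    if not os.path.isfile(source):
        return None  # nothing to collect on this device
    dest = os.path.join(results_dir, 'firmware.log')
    with open(source, 'rb') as src, open(dest, 'wb') as dst:
        dst.write(src.read())
    return dest
```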
,
May 12 2018
This bug is impacting Dru releases.
,
May 12 2018
,
May 12 2018
,
May 14 2018
,
May 14 2018
-> guocb can you take a look at this as a side project? Reach out to nsanders for clarifying requirements as necessary, and jrbarnette for background info on servov4.
,
May 23 2018
,
Jun 8 2018
,
Jun 26 2018
+mruthven, waihong for comment on whether logging in infra will affect logging in FAFT
,
Jun 26 2018
FAFT tests already collect ec, cpu, and cr50 logs. This may interfere with collecting those logs. But if your proposal just attaches ec, cpu, and cr50 logs to all tests, then it can probably replace FAFT log collection.
,
Jun 26 2018
So far, FAFT records these AP and EC console logs at the FAFT framework level. If infra provides a similar console logging mechanism, FAFT can reuse it and avoid duplicating the work.
,
Jun 30 2018
Any suggestion on which sub folder to put these uart log files?
,
Jul 2
,
Jul 16
I guess in servod logs folder?
,
Jul 25
,
Jul 26
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/autotest/+/a1f9cba4bf5663214902aa7b05d20cbff50109d3

commit a1f9cba4bf5663214902aa7b05d20cbff50109d3
Author: Congbin Guo <guocb@chromium.org>
Date: Thu Jul 26 02:45:42 2018

[autotest] servo: capture CPU/EC UART.

This change adds logs for CPU/EC UART streams. Logging starts at test start, and logs are saved when the test stops.

TEST=Ran command test_that -b monroe -m monroe chromeos4-row13-rack12-host10 platform_LongPressPower and verified that two files, i.e. cpu_uart.log and ec_uart.log were generated in test results dir.
BUG= chromium:819882

Change-Id: Ia75b290153597e0c0ef2c0e098092b5bc059b59b
Reviewed-on: https://chromium-review.googlesource.com/1139213
Commit-Ready: Congbin Guo <guocb@chromium.org>
Tested-by: Congbin Guo <guocb@chromium.org>
Reviewed-by: Congbin Guo <guocb@chromium.org>

[modify] https://crrev.com/a1f9cba4bf5663214902aa7b05d20cbff50109d3/server/hosts/servo_host.py
[modify] https://crrev.com/a1f9cba4bf5663214902aa7b05d20cbff50109d3/server/cros/servo/servo.py
,
Aug 13
Comment 1 by nsanders@chromium.org
, Mar 8 2018