Issue 781552

Issue metadata

Status: Untriaged
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



crash_reporter: Error reading udev log info crash_reporter-udev-collection-change-card0-drm

Project Member Reported by drinkcat@chromium.org, Nov 4 2017

Issue description

 - poppy-release-R64-10088.0.0-b16887
 - Test result here: http://cautotest/tko/retrieve_logs.cgi?job=/results/153246589-drinkcat/
    => chromeos2-row6-rack4-host19/sysinfo/messages

After an i915 DRM crash, crash_reporter seems to have trouble fetching the udev logs (not sure whether that's always the case or only for i915 crashes):

2017-11-01T17:01:42.984064-07:00 INFO kernel: [ 1733.011878] [drm] GPU HANG: ecode 9:2:0xa8dfbffd, in chrome [13446], reason: Hang on bsd ring, action: reset
2017-11-01T17:01:42.984104-07:00 INFO kernel: [ 1733.011889] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
2017-11-01T17:01:42.984107-07:00 INFO kernel: [ 1733.011894] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
2017-11-01T17:01:42.984109-07:00 INFO kernel: [ 1733.011899] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
2017-11-01T17:01:42.984110-07:00 INFO kernel: [ 1733.011904] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
2017-11-01T17:01:42.984112-07:00 INFO kernel: [ 1733.011910] [drm] GPU crash dump saved to /sys/class/drm/card0/error
2017-11-01T17:01:42.984113-07:00 NOTICE kernel: [ 1733.011976] drm/i915: Resetting chip after gpu hang
2017-11-01T17:01:42.987013-07:00 INFO kernel: [ 1733.014496] [drm] RC6 on
2017-11-01T17:01:42.988348-07:00 INFO kernel: [ 1733.015587] [drm] No GuC firmware known for this platform
2017-11-01T17:01:42.988365-07:00 INFO kernel: [ 1733.015608] [drm] GuC firmware load failed: -19
2017-11-01T17:01:43.130843-07:00 INFO crash_reporter[18470]: libminijail[18470]: mount /dev/log -> /dev/log type ''
2017-11-01T17:01:43.189073-07:00 INFO crash_reporter[18470]: developer image - collect udev crash info.
2017-11-01T17:01:50.289438-07:00 WARNING kernel: [ 1740.309482] show_signal_msg: 273 callbacks suppressed
2017-11-01T17:01:50.289463-07:00 INFO kernel: [ 1740.309497] Watchdog[13458]: segfault at 0 ip 000056606fdf00f7 sp 00007932796649a0 error 6 in chrome[56606c37a000+8118000]
2017-11-01T17:01:50.306203-07:00 INFO crash_reporter[18488]: libminijail[18488]: mount /dev/log -> /dev/log type ''
2017-11-01T17:01:50.376248-07:00 WARNING crash_reporter[18488]: [user] Received crash notification for chrome[13446] sig 11, user 1000 (developer build - not testing - always dumping)
2017-11-01T17:01:50.389859-07:00 INFO crash_reporter[18488]: State of crashed process [13446]: D (disk sleep)
2017-11-01T17:01:50.404888-07:00 WARNING crash_reporter[18470]: Log command "  echo "===i915/parameters===";   grep '' /sys/module/i915/parameters/* |     sed -e 's!^/sys/module/i915/parameters/!!';   for dri in /sys/kernel/debug/dri/*; do     echo "===$dri/i915_error_state===";     cat $dri/i915_error_state;     echo "===$dri/i915_capabilities===";     cat $dri/i915_capabilities;     echo "===$dri/i915_wa_registers===";     cat $dri/i915_wa_registers;   done" exited with 1
>>>>>>>>>>>>>>>>>>>
2017-11-01T17:01:50.404989-07:00 ERR crash_reporter[18470]: Error reading udev log info crash_reporter-udev-collection-change-card0-drm
<<<<<<<<<<<<<<<<<<<
2017-11-01T17:01:50.420795-07:00 WARNING kernel: [ 1740.443093] udevd[18469]: Process '/sbin/crash_reporter --udev=KERNEL=card0:SUBSYSTEM=drm:ACTION=change' failed with exit code 1.

 
If the last cat command failed, that would bubble up as an overall failure. We should probably add an EOF marker to this collector, like the other collectors have.
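To illustrate the point about the last command: a minimal sketch, assuming a POSIX sh and using a made-up path (`/nonexistent/i915_wa_registers`) plus two hypothetical helper functions in place of the real collector script. The status of a command list is the status of the last command it ran, so a failing trailing `cat` makes the whole thing fail, while a trailing `echo` marker resets the status to 0.

```shell
#!/bin/sh
# Hypothetical stand-in for the collector's log command: the last
# command reads a file that does not exist, so it fails.
collect() {
  echo "===section==="
  cat /nonexistent/i915_wa_registers 2>/dev/null
}

# Same body, plus a trailing marker. The echo succeeds, so its
# status (0) becomes the status of the whole function.
collect_with_eof() {
  echo "===section==="
  cat /nonexistent/i915_wa_registers 2>/dev/null
  echo "EOF"
}

collect >/dev/null
status_plain=$?
collect_with_eof >/dev/null
status_eof=$?
echo "without marker: $status_plain, with marker: $status_eof"
```

The marker only hides the failure of the final command; it does not say which earlier step failed.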

Then again, I'm not sure there's any value in tracking the final exit status of any of these collectors. It seems like we should just note the final exit status at the end, but still include the (at least partial) output.
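A rough sketch of that idea, again assuming a POSIX sh. The `run_collector` wrapper and the wording of the note are hypothetical and are not taken from crash_reporter itself; the point is only that the captured output is kept and a non-zero status is recorded as a note rather than treated as a reason to drop everything.

```shell
#!/bin/sh
# Hypothetical wrapper, not the crash_reporter implementation: run a
# log command, keep whatever it printed, and record a non-zero exit
# status as a trailing note instead of discarding the output.
run_collector() {
  output=$(sh -c "$1" 2>&1)
  status=$?
  printf '%s\n' "$output"
  if [ "$status" -ne 0 ]; then
    printf '<log command exited with status %d>\n' "$status"
  fi
  return 0
}

# A command that prints partial data and then fails.
report=$(run_collector 'echo "partial data"; exit 1')
printf '%s\n' "$report"
```

With this shape the caller always gets the partial output, and the report still shows that the command did not finish cleanly.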

https://chromium-review.googlesource.com/755055 should do it, and then you guys can look at updated crash reports to see what further changes (if any) you want to make to this particular collector.
Project Member

Comment 2 by bugdroid1@chromium.org, Nov 9 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform2/+/c5022dd5994fe1206b076e84435c8aa59f61ca55

commit c5022dd5994fe1206b076e84435c8aa59f61ca55
Author: Mike Frysinger <vapier@chromium.org>
Date: Thu Nov 09 08:38:11 2017

crash: always include additional log output

Registered script commands might produce some useful output but then
ultimately exit(1) for a variety of reasons.  Rather than throw away
all of the output and ignore the results, always include the output,
and just add a note that the script overall failed.

In the case of the drm collector, add a final EOF echo statement so
we reset any exit(1) statuses from grep commands.

BUG=chromium:781552
TEST=precq passes

Change-Id: If02c25ad026dfa01b492cb03291d0896f686e9ba
Reviewed-on: https://chromium-review.googlesource.com/755055
Commit-Ready: Mike Frysinger <vapier@chromium.org>
Tested-by: Mike Frysinger <vapier@chromium.org>
Reviewed-by: Ben Chan <benchan@chromium.org>

[modify] https://crrev.com/c5022dd5994fe1206b076e84435c8aa59f61ca55/crash-reporter/crash_reporter_logs.conf
[modify] https://crrev.com/c5022dd5994fe1206b076e84435c8aa59f61ca55/crash-reporter/crash_collector.cc