
Issue 904642

Starred by 2 users

Issue metadata

Status: Started
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Feature




Tast should symbolize crashes on VM builders

Project Member Reported by derat@chromium.org, Nov 13

Issue description

In the betty-pre-cq (VM) build at http://cros-goldeneye/chromeos/healthmonitoring/buildDetails?buildbucketId=8930108995398500768, ui.SingleProcessMashLogin timed out after what looks like a Chrome crash:

2018/11/11 18:37:27 Started test ui.SingleProcessMashLogin
2018/11/11 18:37:27 [18:37:27.710] Restarting ui job
2018/11/11 18:37:28 [18:37:28.475] Waiting for org.chromium.SessionManager D-Bus service
2018/11/11 18:37:28 [18:37:28.495] Asking session_manager to enable Chrome testing
2018/11/11 18:37:28 [18:37:28.498] Waiting for Chrome to write its debugging port to /home/chronos/DevToolsActivePort
2018/11/11 18:39:27 [18:39:27.710] Error at single_process_mash_login.go:26: Chrome login failed: failed to read Chrome debugging port: browser process 8944 replaced by 9154; Chrome probably crashed
...
2018/11/11 18:39:27 Completed test ui.SingleProcessMashLogin in 2m0.053s with 1 error(s)
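
For anyone scanning logs like the one above for this failure mode, here is a minimal Go sketch that pulls the old and new browser PIDs out of the "browser process N replaced by M" message. The helper name parseReplacedPIDs and the regular expression are assumptions made for illustration; they are not taken from Tast's code.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// replacedRE matches the "browser process <old> replaced by <new>" text that
// appears in the error line of the log above. The pattern is an assumption
// based only on that one sample line.
var replacedRE = regexp.MustCompile(`browser process (\d+) replaced by (\d+)`)

// parseReplacedPIDs (hypothetical helper) returns the old and new browser PIDs
// found in line. The final return value is false if the line does not match.
func parseReplacedPIDs(line string) (int, int, bool) {
	m := replacedRE.FindStringSubmatch(line)
	if m == nil {
		return 0, 0, false
	}
	oldPID, err := strconv.Atoi(m[1])
	if err != nil {
		return 0, 0, false
	}
	newPID, err := strconv.Atoi(m[2])
	if err != nil {
		return 0, 0, false
	}
	return oldPID, newPID, true
}

func main() {
	line := "Chrome login failed: failed to read Chrome debugging port: " +
		"browser process 8944 replaced by 9154; Chrome probably crashed"
	if oldPID, newPID, ok := parseReplacedPIDs(line); ok {
		fmt.Printf("browser PID changed from %d to %d\n", oldPID, newPID)
	}
}
```

A PID change only suggests that Chrome restarted; the crash dumps are still needed to find out why.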

It looks like there are some Chrome crash dumps at http://pantheon/storage/browser/chromeos-image-archive/betty-pre-cq/R72-11253.0.0-b3122994/tast_vm_test_results_1/tast_vm_paladin/crashes/ (the three files with generic GUIDs for names), but I don't know how to symbolize them -- I don't see a debug_breakpad.tar.xz file in https://pantheon.corp.google.com/storage/browser/chromeos-image-archive/betty-pre-cq/R72-11253.0.0-b3122994.

James, are you aware of any current mash segfaults that could've caused this?

Others, any ideas why we didn't get debug_breakpad.tar.xz for this betty-pre-cq build? I looked at a random old betty-release directory (https://pantheon.corp.google.com/storage/browser/chromeos-image-archive/betty-pre-cq/R72-11253.0.0-b3122994), and we saved a debug_breakpad.tar.xz file there. Do we deliberately skip storing these for pre-cq builds to save space?
 
by design, the precq does not upload its symbols to the crash server.  it'd be extremely wasteful of resources on our side as well as the crash server.

i think we also skip generating the .sym files, and prob creating the .debug tarball too.  neither step is fast (GB of data).

iirc, autotest should take care of doing the symbolification and such.  i guess tast needs to do that too?
Cc: nya@chromium.org hidehiko@chromium.org
Tast knows how to symbolize minidump files (see the "tast symbolize" command), but it can't do that if it doesn't have debug symbols.

It also doesn't do it automatically, since it's slow and network-intensive unless you ask a devserver to symbolize for you, and that won't work if you're on the corp network. It's also usually unnecessary, since Autotest symbolizes crash files after it runs Tast in the lab. Maybe Tast should do it itself on VM builders, which don't wrap Tast within Autotest, though.

Perhaps the answer here is that I should make "tast symbolize" support automatically using e.g. betty-release/R72-11253.0.0's symbols for betty-pre-cq/R72-11253.0.0-b3122994. I'll give that a try. No idea if it'll work for all packages, but maybe it'll at least work for Chrome?
(It doesn't seem to work for Chrome, at least in this case.)
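
To make the path mapping tried above concrete, here is a rough Go sketch that derives the release archive directory from a pre-cq one by dropping the "-b<build number>" suffix and swapping the builder name. The helper name releasePathFor and the regular expression are hypothetical and inferred only from the two paths quoted in this bug; this is not how "tast symbolize" is implemented. As noted above, the mapping did not produce usable Chrome symbols in this case.

```go
package main

import (
	"fmt"
	"regexp"
)

// preCQRE matches an archive directory such as
// "betty-pre-cq/R72-11253.0.0-b3122994". The layout is an assumption based on
// the paths quoted in this bug.
var preCQRE = regexp.MustCompile(`^([a-z0-9_-]+)-pre-cq/(R\d+-\d+\.\d+\.\d+)-b\d+$`)

// releasePathFor (hypothetical helper) maps a pre-cq archive directory to the
// release directory with the same board and version. It returns false if the
// input does not look like a pre-cq archive directory.
func releasePathFor(preCQPath string) (string, bool) {
	m := preCQRE.FindStringSubmatch(preCQPath)
	if m == nil {
		return "", false
	}
	return m[1] + "-release/" + m[2], true
}

func main() {
	if p, ok := releasePathFor("betty-pre-cq/R72-11253.0.0-b3122994"); ok {
		fmt.Println(p)
	}
}
```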
you have no guarantee that symbol/debug info for any other release will line up with the current build.  please don't try to rely on that.

the bots not uploading debug info to GS isn't the same thing as the debug info not being available on the bot.  you have the sdk & the build in it, and the debug info should be in there for code to access.
Owner: derat@chromium.org
Status: Started (was: Untriaged)
Components: -Internals>Services>Ash
Labels: -Type-Bug Type-Feature
Summary: Tast should symbolize crashes on VM builders (was: Chrome segfaulted in ui.SingleProcessMashLogin on betty-pre-cq; no symbols uploaded)
Thanks, that makes sense.

I'm repurposing this bug to track making Tast symbolize these crashes automatically. I'm going to add a -symbolize flag to the "run" subcommand, but I also need to make us smarter when symbolizing multiple crash dumps and add a way to disable downloading debug_breakpad.tar.xz files (because downloading those in the lab would be a good way to make infra people angry with me really quickly).
