
Issue 667475

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Dec 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 1
Type: Bug-Regression

Blocked on:
issue 551401

Blocking:
issue 596622




catapult tampers with Crashpad’s database resulting in reports that crashpad_database_util can’t extract

Project Member Reported by kbr@chromium.org, Nov 21 2016

Issue description

chromium_try_flakes is indicating the WebGL conformance tests have just gotten a lot flakier today (November 21):
https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyLwsSBUZsYWtlIiR3ZWJnbF9jb25mb3JtYW5jZV90ZXN0cyAod2l0aCBwYXRjaCkM

and triaging a few of the failures:
https://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_ng/builds/340045
https://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_ng/builds/339989
https://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_ng/builds/339967

the crashes are happening in random tests.

Our group strongly prefers to not mark all of the tests flaky on a given platform, which would be the only workaround.

The problem is that the minidumps being generated appear to be corrupt, or at least, crashpad_database_util is unable to process them. The harness has code to symbolize these minidumps, but the process is failing. Here's one example from the first crash above:


WebglConformance_conformance_ogles_GL_biConstants_biConstants_001_to_008 (gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest) ... Received signal 11 <unknown> 000000000000
[0x00010b12d836]
[0x7fff863ee52a]
[0x00010e898549]
[0x0001101450b4]
[0x000110144d80]
[0x00011018292f]
[0x00011013dbc8]
[0x00011013df5c]
[0x00010f4b9d25]
[0x00010f0ceeeb]
[0x00010f06bd26]
[0x00010f8a2cea]
[0x00010f8a2ed4]
[0x00010f887f23]
[0x00010f887e2e]
[0x00010f3052f2]
[0x00010f3241a7]
[0x00010e6cd1f4]
[0x00010e6cdcc6]
[0x00010e69f36e]
[0x00010e6a1195]
[0x00010e69de35]
[0x00010e69d330]
[0x00010e6a2e00]
[0x00010b12e091]
[0x00010e89fc2b]
[0x00010e89e33d]
[0x00010b12e091]
[0x00010b167e4b]
[0x00010b16822c]
[0x00010b1686b3]
[0x00010b16c217]
[0x00010b152dda]
[0x00010b16bbe4]
[0x7fff949ae881]
[0x7fff9498dfbc]
[0x7fff9498d4df]
[0x7fff9498ced8]
[0x7fff91388ed9]
[0x00010b16c9fe]
[0x00010b16c067]
[0x00010b167b12]
[0x00010b198f53]
[0x000110384c5f]
[0x00010ab40c4f]
[0x00010ab3fce6]
[0x000108e6e22c]
[0x000108c36daa]
[0x7fff89aad5ad]
[0x00000000001c]
[end of stack trace]
(INFO) 2016-11-21 11:29:43,540 desktop_browser_backend._GetAllCrashpadMinidumps:361 Found crashpad_database_util
(INFO) 2016-11-21 11:29:43,555 desktop_browser_backend.GetStackTrace:530 Minidump found: /b/s/w/it5flpGh/tmpTqIIjk/completed/4d1e6939-a78a-4bb8-a4b4-1cc0a1078fa4.dmp
(INFO) 2016-11-21 11:29:43,555 cloud_storage.Insert:310 Uploading /b/s/w/it5flpGh/tmpTqIIjk/completed/4d1e6939-a78a-4bb8-a4b4-1cc0a1078fa4.dmp to gs://chrome-telemetry-output/minidump-2016-11-21_11-29-43-463004.dmp
(INFO) 2016-11-21 11:29:45,039 cloud_storage._GetLocked:272 Downloading gs://chromium-telemetry/binary_dependencies/minidump_stackwalk_76c5983fc9e9316a9d4251ba3e68b955c4fc9bf3 to /b/s/w/irpSnVRy/third_party/catapult/telemetry/telemetry/internal/bin/mac/x86_64/minidump_stackwalk
(INFO) 2016-11-21 11:29:45,629 desktop_browser_backend.GenerateBreakpadSymbols:79 Dumping breakpad symbols.
(INFO) 2016-11-21 11:29:45,629 cloud_storage._GetLocked:272 Downloading gs://chromium-telemetry/binary_dependencies/minidump_dump_c39bd7a3b9fa6279893b2d759045699d79ce4dcb to /b/s/w/irpSnVRy/third_party/catapult/telemetry/telemetry/internal/bin/mac/x86_64/minidump_dump
(INFO) 2016-11-21 11:29:58,189 desktop_browser_backend._GetAllCrashpadMinidumps:361 Found crashpad_database_util
[1121/112958:WARNING:crash_report_database_mac.mm(697)] Failed to read report metadata for /b/s/w/it5flpGh/tmpTqIIjk/completed/4d1e6939-a78a-4bb8-a4b4-1cc0a1078fa4.dmp.stripped


The underlying bug is probably leading to a renderer process crash, but without a stack trace it'll be impossible to make progress.

Could folks more familiar with the crash reporting infrastructure help me understand what the next step would be? Thanks.

Note that unfortunately this problem doesn't seem to be happening with the same frequency on the waterfall:

https://build.chromium.org/p/chromium.gpu/builders/Mac%2010.10%20Retina%20Release%20%28AMD%29?numbuilds=200
https://build.chromium.org/p/chromium.gpu/builders/Mac%2010.10%20Retina%20Debug%20%28AMD%29?numbuilds=200
https://build.chromium.org/p/chromium.gpu/builders/Mac%2010.10%20Release%20%28Intel%29?numbuilds=200
https://build.chromium.org/p/chromium.gpu/builders/Mac%2010.10%20Debug%20%28Intel%29?numbuilds=200

 

Comment 1 by kbr@chromium.org, Nov 21 2016

Labels: OS-Mac

Comment 2 by mark@chromium.org, Nov 21 2016

Cc: dyen@chromium.org
Summary: catapult tampers with Crashpad’s database resulting in reports that crashpad_database_util can’t extract (was: crashpad_database_util unable to read minidumps on bots)
crashpad_database_util doesn’t read minidumps. This infrastructure uses crashpad_database_util to pull minidumps out of the database.

I don’t know what a .stripped file is, but it’s not generated by Crashpad. Crashpad owns its own database. Whoever is messing around in there by hand and creating these .stripped files needs to stop it.

It’s this, I guess:

https://codesearch.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/internal/backends/chrome/desktop_browser_backend.py

Upstream:

https://github.com/catapult-project/catapult/blob/master/telemetry/telemetry/internal/backends/chrome/desktop_browser_backend.py

Comment 3 by kbr@chromium.org, Nov 21 2016

Blockedon: 551401
Cc: -dyen@chromium.org
Components: -Blink>JavaScript
Owner: nedngu...@google.com
Status: Assigned (was: Untriaged)
Sorry for my confusion and thanks for the clarification Mark.

Emily, Ned, could you help me understand what's happening here? Is the browser's GetAllUnsymbolizedMinidumpPaths supposed to be filtering out these .stripped files? It seems to me that DesktopBrowserBackend._GetStackFromMinidump shouldn't be writing them in that directory in the first place.

Comment 4 by kbr@chromium.org, Nov 21 2016

I found one crash stack on Linux which pointed to Issue 537054 as the root cause of these crashes. I'm fixing that with high priority now.

Owner: eyaich@chromium.org

Comment 6 by mark@chromium.org, Nov 22 2016

That infile.read().partition('MDMP') thing that catapult is doing is totally bogus for a Crashpad-produced minidump anyway. Minidump-format dumps in a Crashpad database will never not lead with the 'MDMP' minidump signature.
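
To illustrate the point above, here is a minimal Python sketch; it is not Telemetry's actual code. It assumes the stripping step exists to discard bytes that precede the 'MDMP' signature in some non-Crashpad dumps. The helper names and the sample bytes are hypothetical.

```python
MINIDUMP_SIGNATURE = b'MDMP'  # first four bytes of a minidump file


def leads_with_signature(data):
    """True if the bytes already start with the minidump signature."""
    return data.startswith(MINIDUMP_SIGNATURE)


def strip_prefix(data):
    """The partition idiom: drop everything before the first signature."""
    _, signature, rest = data.partition(MINIDUMP_SIGNATURE)
    if not signature:
        raise ValueError('no minidump signature found')
    return signature + rest


# Synthetic stand-ins, not real dumps.
crashpad_style = MINIDUMP_SIGNATURE + b'\x00fake-streams'
prefixed_style = b'some-header\r\n\r\n' + crashpad_style

# For data that already leads with the signature, stripping changes nothing,
# so a separate ".stripped" copy carries no new information.
assert leads_with_signature(crashpad_style)
assert strip_prefix(crashpad_style) == crashpad_style

# Only data with a prefix is actually altered.
assert not leads_with_signature(prefixed_style)
assert strip_prefix(prefixed_style) == crashpad_style
```

If this assumption holds, a check like leads_with_signature() would be enough to skip the stripping step for any dump that comes out of a Crashpad database.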

Comment 7 by kbr@chromium.org, Nov 30 2016

This is preventing minidumps from being symbolized in our test runs, so we can't diagnose the reasons for test failures. One recent example:

https://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_ng/builds/345179

WebglConformance_conformance_ogles_GL_vec3_vec3_001_to_008 (gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest) ... Received signal 11 <unknown> 000000000000
[0x00010b842e66]
[0x7fff93b7a52a]
[0x00010ced7de4]
[0x000110b09ad4]
[0x000110b096a0]
[0x000110b5543f]
[0x000110b01fbf]
[0x000110b0234c]
[0x00010fdd8b65]
[0x00010f96777c]
[0x0001101e2f2a]
[0x0001101e3187]
[0x0001101c4d33]
[0x0001101c4c3e]
[0x00010fc0fd62]
[0x00010fc2f3d7]
[0x00010ef5d8e5]
[0x00010ef5e326]
[0x00010ef2ec21]
[0x00010ef30cea]
[0x00010ef2d6d5]
[0x00010ef2cbe0]
[0x00010ef32c38]
[0x00010ef32fdb]
[0x00010b8436d1]
[0x00010f139a62]
[0x00010f13808a]
[0x00010f13aec4]
[0x00010b8436d1]
[0x00010b87e059]
[0x00010b87e41c]
[0x00010b87e913]
[0x00010b882957]
[0x00010b867fda]
[0x00010b882324]
[0x7fff9b5b37e1]
[0x7fff9b592f1c]
[0x7fff9b59243f]
[0x7fff9b591e38]
[0x7fff8e14ced9]
[0x00010b8832be]
[0x00010b8827a7]
[0x00010b87dda2]
[0x00010b8b2983]
[0x000110d67c43]
[0x00010b225ed7]
[0x00010b224f36]
[0x000109440d8c]
[0x000109209daa]
[0x7fff9cef55ad]
[end of stack trace]
(INFO) 2016-11-30 10:04:11,873 desktop_browser_backend._GetAllCrashpadMinidumps:361 Found crashpad_database_util
(INFO) 2016-11-30 10:04:11,887 desktop_browser_backend.GetStackTrace:530 Minidump found: /b/s/w/itFZ7RZ4/tmp_ut_q1/completed/cfedb169-919a-49f3-bcc1-493e204c8c37.dmp
(INFO) 2016-11-30 10:04:11,887 cloud_storage.Insert:312 Uploading /b/s/w/itFZ7RZ4/tmp_ut_q1/completed/cfedb169-919a-49f3-bcc1-493e204c8c37.dmp to gs://chrome-telemetry-output/minidump-2016-11-30_10-04-11-988903.dmp
(INFO) 2016-11-30 10:04:12,954 cloud_storage._GetLocked:274 Downloading gs://chromium-telemetry/binary_dependencies/minidump_stackwalk_76c5983fc9e9316a9d4251ba3e68b955c4fc9bf3 to /b/s/w/irVyfbo3/third_party/catapult/telemetry/telemetry/internal/bin/mac/x86_64/minidump_stackwalk
(INFO) 2016-11-30 10:04:13,520 desktop_browser_backend.GenerateBreakpadSymbols:79 Dumping breakpad symbols.
(INFO) 2016-11-30 10:04:13,521 cloud_storage._GetLocked:274 Downloading gs://chromium-telemetry/binary_dependencies/minidump_dump_c39bd7a3b9fa6279893b2d759045699d79ce4dcb to /b/s/w/irVyfbo3/third_party/catapult/telemetry/telemetry/internal/bin/mac/x86_64/minidump_dump
(INFO) 2016-11-30 10:04:25,549 desktop_browser_backend._GetAllCrashpadMinidumps:361 Found crashpad_database_util
[1130/100425.559700:WARNING:crash_report_database_mac.mm(697)] Failed to read report metadata for /b/s/w/itFZ7RZ4/tmp_ut_q1/completed/cfedb169-919a-49f3-bcc1-493e204c8c37.dmp.stripped


and no stack trace was produced.

This is critical. If nobody can work on this in the short term, can I take it and remove the associated code from Telemetry?

Comment 8 by kbr@chromium.org, Nov 30 2016

Cc: -rsesek@chromium.org kainino@chromium.org
As a short-term fix, can we copy the original files somewhere so that your test code can upload them to cloud storage for later investigation?

Comment 10 by kbr@chromium.org, Nov 30 2016

Sorry but that doesn't help. The minidumps are already being uploaded to cloud storage. 
minidump_stackwalk requires the binaries which produced the crash, and that's what's painful to reconstruct. I just tried downloading the minidump from the above crash and got the attached result, which had no symbols for Chromium Helper and Chromium Framework. We really need the minidumps processed on the machine which generated them, at the point of the crash. The .stripped files are bogus and we should just stop producing them.

Attachment: symbolized-minidump.txt (54.3 KB)
OK, let's start by removing this logic for Mac. It is not clear to me whether it is needed for Windows and Linux, so I am leaving it in place on those platforms until I get to the root of the problem.

CL coming shortly for just removing the stripped suffix and we will see what that does for us. 

Comment 12 by mark@chromium.org, Dec 1 2016

Windows is a Crashpad platform too.

You should remove this logic for any Crashpad platform. Since it looks like you’re dealing with crashpad_database_util around here, it seems like you know when you’re talking to a Crashpad database. You shouldn’t need to do that “.stripped” file thing in any Crashpad database, and you shouldn’t have a need to do it for any crash report written by Crashpad.
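
One way to follow this advice can be sketched in Python as below. This is an illustration under stated assumptions, not the fix that landed: the function name, the from_crashpad flag, and the scratch-directory layout are all hypothetical. The idea is that a Crashpad report is read in place and nothing is ever written next to it, while any stripped copy for other dump sources goes to a separate temporary directory.

```python
import os
import tempfile

MINIDUMP_SIGNATURE = b'MDMP'


def minidump_for_stackwalk(minidump_path, from_crashpad):
    """Return a path that can be handed to a stack walker.

    Hypothetical helper for illustration only.
    """
    if from_crashpad:
        # Crashpad owns its database: use the report as-is and never create
        # sibling files such as "<report>.dmp.stripped" inside it.
        return minidump_path

    with open(minidump_path, 'rb') as infile:
        _, signature, rest = infile.read().partition(MINIDUMP_SIGNATURE)
    if not signature:
        raise ValueError('%s does not contain a minidump' % minidump_path)

    # Write the stripped copy to a scratch directory, away from the dumps.
    scratch_dir = tempfile.mkdtemp(prefix='stripped_minidump_')
    stripped_path = os.path.join(scratch_dir, os.path.basename(minidump_path))
    with open(stripped_path, 'wb') as outfile:
        outfile.write(signature + rest)
    return stripped_path
```

With this shape, a later scan of the dump directory would only ever see files that the crash reporter itself wrote.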
For the record, I've got all these crashes I can't analyze because of this:
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Release%20%28Intel%29/builds/19737
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Release%20%28Intel%29/builds/19699
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Release%20%28Intel%29/builds/19616
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Release%20%28Intel%29/builds/19613
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Release%20%28Intel%29/builds/19587

https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%20Retina%20Release/builds/9326
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%20Retina%20Release/builds/9307
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%20Retina%20Release/builds/9249
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%20Retina%20Release/builds/9239
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%20Retina%20Release/builds/9185
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%20Retina%20Release/builds/9182

https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Retina%20Release%20%28AMD%29/builds/11140
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Retina%20Release%20%28AMD%29/builds/11079
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Retina%20Release%20%28AMD%29/builds/11056
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Retina%20Release%20%28AMD%29/builds/11021
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Retina%20Release%20%28AMD%29/builds/10982
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Retina%20Release%20%28AMD%29/builds/10956

https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.11%20Retina%20Release%20%28AMD%29/builds/1047
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.11%20Retina%20Release%20%28AMD%29/builds/1008
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.11%20Retina%20Release%20%28AMD%29/builds/979
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.11%20Retina%20Release%20%28AMD%29/builds/973
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.11%20Retina%20Release%20%28AMD%29/builds/970


Comment 14 by bugdroid1@chromium.org, Dec 2 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e248b869b5b687c1bdbf88203b9824a260a7d404

commit e248b869b5b687c1bdbf88203b9824a260a7d404
Author: catapult-deps-roller <catapult-deps-roller@chromium.org>
Date: Fri Dec 02 21:10:27 2016

Roll src/third_party/catapult/ a631bc329..8d05c456e (4 commits).

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/a631bc329824..8d05c456e60b

$ git log a631bc329..8d05c456e --date=short --no-merges --format='%ad %ae %s'
2016-12-02 jessimb Fixed the big referenced in #2964, nudge should always work now. Moved the dropdown menu options out of the drowdown so when the change in the UI corresponds better to the user input. This helps to makes it more clear that +1 is one point higher than the initial point, not where you are now.
2016-12-02 benjhayden Style left, right buttons in groupby-picker.
2016-12-02 aiolos Update the Chrome Stable channel reference build.
2016-12-02 eyaich Removing the .stripped suffix from minidumps produced by crashpad.

BUG= 667475 

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls

CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=catapult-sheriff@chromium.org

Review-Url: https://codereview.chromium.org/2547943002
Cr-Commit-Position: refs/heads/master@{#436015}

[modify] https://crrev.com/e248b869b5b687c1bdbf88203b9824a260a7d404/DEPS

Comment 15 by kbr@chromium.org, Dec 2 2016

Cc: ynovikov@chromium.org
Status: Fixed (was: Assigned)
Thanks Emily for this fix, which was reviewed in:
https://codereview.chromium.org/2545933002

Closing as Fixed. Hasn't been explicitly verified yet.
