catapult tampers with Crashpad’s database, resulting in reports that crashpad_database_util can’t extract
Issue description

chromium_try_flakes is indicating the WebGL conformance tests have just gotten a lot flakier today (November 21):
https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyLwsSBUZsYWtlIiR3ZWJnbF9jb25mb3JtYW5jZV90ZXN0cyAod2l0aCBwYXRjaCkM
and triaging a few of the failures:
https://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_ng/builds/340045
https://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_ng/builds/339989
https://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_ng/builds/339967
the crashes are happening in random tests. Our group strongly prefers to not mark all of the tests flaky on a given platform, which would be the only workaround.

The problem is that the minidumps being generated appear to be corrupt, or at least, crashpad_database_util is unable to process them. The harness has code to symbolize these minidumps, but the process is failing. Here's one example from the first crash above:

WebglConformance_conformance_ogles_GL_biConstants_biConstants_001_to_008 (gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest) ...
Received signal 11 <unknown> 000000000000
[0x00010b12d836] [0x7fff863ee52a] [0x00010e898549] [0x0001101450b4] [0x000110144d80] [0x00011018292f] [0x00011013dbc8] [0x00011013df5c] [0x00010f4b9d25] [0x00010f0ceeeb] [0x00010f06bd26] [0x00010f8a2cea] [0x00010f8a2ed4] [0x00010f887f23] [0x00010f887e2e] [0x00010f3052f2] [0x00010f3241a7] [0x00010e6cd1f4] [0x00010e6cdcc6] [0x00010e69f36e] [0x00010e6a1195] [0x00010e69de35] [0x00010e69d330] [0x00010e6a2e00] [0x00010b12e091] [0x00010e89fc2b] [0x00010e89e33d] [0x00010b12e091] [0x00010b167e4b] [0x00010b16822c] [0x00010b1686b3] [0x00010b16c217] [0x00010b152dda] [0x00010b16bbe4] [0x7fff949ae881] [0x7fff9498dfbc] [0x7fff9498d4df] [0x7fff9498ced8] [0x7fff91388ed9] [0x00010b16c9fe] [0x00010b16c067] [0x00010b167b12] [0x00010b198f53] [0x000110384c5f] [0x00010ab40c4f] [0x00010ab3fce6] [0x000108e6e22c] [0x000108c36daa] [0x7fff89aad5ad] [0x00000000001c]
[end of stack trace]
(INFO) 2016-11-21 11:29:43,540 desktop_browser_backend._GetAllCrashpadMinidumps:361 Found crashpad_database_util
(INFO) 2016-11-21 11:29:43,555 desktop_browser_backend.GetStackTrace:530 Minidump found: /b/s/w/it5flpGh/tmpTqIIjk/completed/4d1e6939-a78a-4bb8-a4b4-1cc0a1078fa4.dmp
(INFO) 2016-11-21 11:29:43,555 cloud_storage.Insert:310 Uploading /b/s/w/it5flpGh/tmpTqIIjk/completed/4d1e6939-a78a-4bb8-a4b4-1cc0a1078fa4.dmp to gs://chrome-telemetry-output/minidump-2016-11-21_11-29-43-463004.dmp
(INFO) 2016-11-21 11:29:45,039 cloud_storage._GetLocked:272 Downloading gs://chromium-telemetry/binary_dependencies/minidump_stackwalk_76c5983fc9e9316a9d4251ba3e68b955c4fc9bf3 to /b/s/w/irpSnVRy/third_party/catapult/telemetry/telemetry/internal/bin/mac/x86_64/minidump_stackwalk
(INFO) 2016-11-21 11:29:45,629 desktop_browser_backend.GenerateBreakpadSymbols:79 Dumping breakpad symbols.
(INFO) 2016-11-21 11:29:45,629 cloud_storage._GetLocked:272 Downloading gs://chromium-telemetry/binary_dependencies/minidump_dump_c39bd7a3b9fa6279893b2d759045699d79ce4dcb to /b/s/w/irpSnVRy/third_party/catapult/telemetry/telemetry/internal/bin/mac/x86_64/minidump_dump
(INFO) 2016-11-21 11:29:58,189 desktop_browser_backend._GetAllCrashpadMinidumps:361 Found crashpad_database_util
[1121/112958:WARNING:crash_report_database_mac.mm(697)] Failed to read report metadata for /b/s/w/it5flpGh/tmpTqIIjk/completed/4d1e6939-a78a-4bb8-a4b4-1cc0a1078fa4.dmp.stripped

The underlying bug is probably leading to a renderer process crash, but without a stack trace it'll be impossible to make progress. Could folks more familiar with the crash reporting infrastructure help me understand what the next step would be? Thanks.

Note that unfortunately this problem doesn't seem to be happening with the same frequency on the waterfall:
https://build.chromium.org/p/chromium.gpu/builders/Mac%2010.10%20Retina%20Release%20%28AMD%29?numbuilds=200
https://build.chromium.org/p/chromium.gpu/builders/Mac%2010.10%20Retina%20Debug%20%28AMD%29?numbuilds=200
https://build.chromium.org/p/chromium.gpu/builders/Mac%2010.10%20Release%20%28Intel%29?numbuilds=200
https://build.chromium.org/p/chromium.gpu/builders/Mac%2010.10%20Debug%20%28Intel%29?numbuilds=200
,
Nov 21 2016
crashpad_database_util doesn’t read minidumps. This infrastructure uses crashpad_database_util to pull minidumps out of the database. I don’t know what a .stripped file is, but it’s not generated by Crashpad. Crashpad owns its own database. Whoever is messing around in there by hand and creating these .stripped files needs to stop it. It’s this, I guess: https://codesearch.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/internal/backends/chrome/desktop_browser_backend.py Upstream: https://github.com/catapult-project/catapult/blob/master/telemetry/telemetry/internal/backends/chrome/desktop_browser_backend.py
,
Nov 21 2016
Sorry for my confusion and thanks for the clarification Mark. Emily, Ned, could you help me understand what's happening here? Is the browser's GetAllUnsymbolizedMinidumpPaths supposed to be filtering out these .stripped files? It seems to me that DesktopBrowserBackend._GetStackFromMinidump shouldn't be writing them in that directory in the first place.
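The filtering option asked about above could look roughly like the following. This is only a sketch under assumptions: filter_unsymbolized_minidumps is a hypothetical helper, not Telemetry's actual GetAllUnsymbolizedMinidumpPaths, and the paths are made up for illustration.

```python
def filter_unsymbolized_minidumps(paths, derived_suffix='.stripped'):
    """Return only the original minidump paths.

    Hypothetical helper: drops derived copies (files ending in
    derived_suffix) so that only files written by the crash reporter
    itself are reported as minidumps.
    """
    return [path for path in paths if not path.endswith(derived_suffix)]


# Made-up example paths, shaped like the ones in the log above.
example_paths = [
    '/tmp/db/completed/4d1e6939.dmp',
    '/tmp/db/completed/4d1e6939.dmp.stripped',
]
originals = filter_unsymbolized_minidumps(example_paths)
```

Filtering only hides the derived files from the path listing; the files would still sit inside the database directory, so not writing them there at all is the other option raised above.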
,
Nov 21 2016
I found one crash stack on Linux which pointed to Issue 537054 as the root cause of these crashes. I'm fixing that with high priority now.
,
Nov 21 2016
,
Nov 22 2016
That infile.read().partition('MDMP') thing that catapult is doing is totally bogus for a Crashpad-produced minidump anyway. Minidump-format dumps in a Crashpad database will never not lead with the 'MDMP' minidump signature.
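To illustrate why the partition step does nothing useful here, a minimal sketch, assuming the stripping looks roughly like the expression quoted above. strip_to_signature is a hypothetical name, and the byte strings are invented sample data, not real minidumps.

```python
MINIDUMP_SIGNATURE = b'MDMP'


def strip_to_signature(data):
    """Return data starting at the first 'MDMP' signature, or None if absent.

    Hypothetical helper modeled on infile.read().partition('MDMP').
    """
    _, signature, rest = data.partition(MINIDUMP_SIGNATURE)
    if not signature:
        return None
    return signature + rest


# A dump that already leads with the signature comes back unchanged.
leading_signature = MINIDUMP_SIGNATURE + b'\x93\xa7\x00\x00payload'
# A dump preceded by other bytes (invented here) gets trimmed to the signature.
wrapped = b'leading bytes\r\n\r\n' + leading_signature
```

For input that already starts with the signature, the result is byte-for-byte identical to the original file, so the extra copy adds nothing.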
,
Nov 30 2016
This is preventing minidumps from being symbolized in our test runs, so we can't diagnose the reasons for test failures. One recent example:
https://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_ng/builds/345179

WebglConformance_conformance_ogles_GL_vec3_vec3_001_to_008 (gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest) ...
Received signal 11 <unknown> 000000000000
[0x00010b842e66] [0x7fff93b7a52a] [0x00010ced7de4] [0x000110b09ad4] [0x000110b096a0] [0x000110b5543f] [0x000110b01fbf] [0x000110b0234c] [0x00010fdd8b65] [0x00010f96777c] [0x0001101e2f2a] [0x0001101e3187] [0x0001101c4d33] [0x0001101c4c3e] [0x00010fc0fd62] [0x00010fc2f3d7] [0x00010ef5d8e5] [0x00010ef5e326] [0x00010ef2ec21] [0x00010ef30cea] [0x00010ef2d6d5] [0x00010ef2cbe0] [0x00010ef32c38] [0x00010ef32fdb] [0x00010b8436d1] [0x00010f139a62] [0x00010f13808a] [0x00010f13aec4] [0x00010b8436d1] [0x00010b87e059] [0x00010b87e41c] [0x00010b87e913] [0x00010b882957] [0x00010b867fda] [0x00010b882324] [0x7fff9b5b37e1] [0x7fff9b592f1c] [0x7fff9b59243f] [0x7fff9b591e38] [0x7fff8e14ced9] [0x00010b8832be] [0x00010b8827a7] [0x00010b87dda2] [0x00010b8b2983] [0x000110d67c43] [0x00010b225ed7] [0x00010b224f36] [0x000109440d8c] [0x000109209daa] [0x7fff9cef55ad]
[end of stack trace]
(INFO) 2016-11-30 10:04:11,873 desktop_browser_backend._GetAllCrashpadMinidumps:361 Found crashpad_database_util
(INFO) 2016-11-30 10:04:11,887 desktop_browser_backend.GetStackTrace:530 Minidump found: /b/s/w/itFZ7RZ4/tmp_ut_q1/completed/cfedb169-919a-49f3-bcc1-493e204c8c37.dmp
(INFO) 2016-11-30 10:04:11,887 cloud_storage.Insert:312 Uploading /b/s/w/itFZ7RZ4/tmp_ut_q1/completed/cfedb169-919a-49f3-bcc1-493e204c8c37.dmp to gs://chrome-telemetry-output/minidump-2016-11-30_10-04-11-988903.dmp
(INFO) 2016-11-30 10:04:12,954 cloud_storage._GetLocked:274 Downloading gs://chromium-telemetry/binary_dependencies/minidump_stackwalk_76c5983fc9e9316a9d4251ba3e68b955c4fc9bf3 to /b/s/w/irVyfbo3/third_party/catapult/telemetry/telemetry/internal/bin/mac/x86_64/minidump_stackwalk
(INFO) 2016-11-30 10:04:13,520 desktop_browser_backend.GenerateBreakpadSymbols:79 Dumping breakpad symbols.
(INFO) 2016-11-30 10:04:13,521 cloud_storage._GetLocked:274 Downloading gs://chromium-telemetry/binary_dependencies/minidump_dump_c39bd7a3b9fa6279893b2d759045699d79ce4dcb to /b/s/w/irVyfbo3/third_party/catapult/telemetry/telemetry/internal/bin/mac/x86_64/minidump_dump
(INFO) 2016-11-30 10:04:25,549 desktop_browser_backend._GetAllCrashpadMinidumps:361 Found crashpad_database_util
[1130/100425.559700:WARNING:crash_report_database_mac.mm(697)] Failed to read report metadata for /b/s/w/itFZ7RZ4/tmp_ut_q1/completed/cfedb169-919a-49f3-bcc1-493e204c8c37.dmp.stripped

and no stack trace was produced. This is critical. If nobody can work on this in the short term, can I take it and remove the associated code from Telemetry?
,
Nov 30 2016
,
Nov 30 2016
As a short-term fix, can we copy the original files somewhere so that your test code can upload them to cloud storage for later investigation?
,
Nov 30 2016
Sorry but that doesn't help. The minidumps are already being uploaded to cloud storage. minidump_stackwalk requires the binaries which produced the crash, and that's what's painful to reconstruct. I just tried downloading the minidump from the above crash and got the attached result, which had no symbols for Chromium Helper and Chromium Framework. We really need the minidumps processed on the machine which generated them, at the point of the crash. The .stripped files are bogus and we should just stop producing them.
,
Dec 1 2016
OK, let's start with removing this logic for Mac. It is not clear to me whether it is needed for Windows/Linux, so I am leaving it in place on those platforms until I get to the root of the problem. A CL is coming shortly that just removes the .stripped suffix, and we will see what that does for us.
,
Dec 1 2016
Windows is a Crashpad platform too. You should remove this logic for any Crashpad platform. Since it looks like you’re dealing with crashpad_database_util around here, it seems like you know when you’re talking to a Crashpad database. You shouldn’t need to do that “.stripped” file thing in any Crashpad database, and you shouldn’t have a need to do it for any crash report written by Crashpad.
,
Dec 2 2016
For the record, I've got all these crashes I can't analyze because of this:
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Release%20%28Intel%29/builds/19737
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Release%20%28Intel%29/builds/19699
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Release%20%28Intel%29/builds/19616
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Release%20%28Intel%29/builds/19613
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Release%20%28Intel%29/builds/19587
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%20Retina%20Release/builds/9326
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%20Retina%20Release/builds/9307
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%20Retina%20Release/builds/9249
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%20Retina%20Release/builds/9239
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%20Retina%20Release/builds/9185
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%20Retina%20Release/builds/9182
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Retina%20Release%20%28AMD%29/builds/11140
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Retina%20Release%20%28AMD%29/builds/11079
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Retina%20Release%20%28AMD%29/builds/11056
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Retina%20Release%20%28AMD%29/builds/11021
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Retina%20Release%20%28AMD%29/builds/10982
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.10%20Retina%20Release%20%28AMD%29/builds/10956
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.11%20Retina%20Release%20%28AMD%29/builds/1047
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.11%20Retina%20Release%20%28AMD%29/builds/1008
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.11%20Retina%20Release%20%28AMD%29/builds/979
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.11%20Retina%20Release%20%28AMD%29/builds/973
https://build.chromium.org/p/chromium.gpu.fyi/builders/Mac%2010.11%20Retina%20Release%20%28AMD%29/builds/970
,
Dec 2 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/e248b869b5b687c1bdbf88203b9824a260a7d404 commit e248b869b5b687c1bdbf88203b9824a260a7d404 Author: catapult-deps-roller <catapult-deps-roller@chromium.org> Date: Fri Dec 02 21:10:27 2016 Roll src/third_party/catapult/ a631bc329..8d05c456e (4 commits). https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/a631bc329824..8d05c456e60b $ git log a631bc329..8d05c456e --date=short --no-merges --format='%ad %ae %s' 2016-12-02 jessimb Fixed the big referenced in #2964, nudge should always work now. Moved the dropdown menu options out of the drowdown so when the change in the UI corresponds better to the user input. This helps to makes it more clear that +1 is one point higher than the initial point, not where you are now. 2016-12-02 benjhayden Style left, right buttons in groupby-picker. 2016-12-02 aiolos Update the Chrome Stable channel reference build. 2016-12-02 eyaich Removing the .stripped suffix from minidumps produced by crashpad. BUG= 667475 Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md If the roll is causing failures, see: http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel TBR=catapult-sheriff@chromium.org Review-Url: https://codereview.chromium.org/2547943002 Cr-Commit-Position: refs/heads/master@{#436015} [modify] https://crrev.com/e248b869b5b687c1bdbf88203b9824a260a7d404/DEPS
,
Dec 2 2016
Thanks Emily for this fix, which was reviewed in: https://codereview.chromium.org/2545933002 Closing as Fixed. Hasn't been explicitly verified yet.
Comment 1 by kbr@chromium.org
, Nov 21 2016