Crashpad Windows trybots (and probably buildbots) are broken |
|||||||||||
Issue descriptionWe’ve been seeing a lot of this today: https://build.chromium.org/p/client.crashpad/builders/crashpad_try_win_x86_rel/builds/191 https://build.chromium.org/p/client.crashpad/builders/crashpad_try_win_x86_dbg/builds/192 The error message: @@@STEP_LOG_LINE@exception@WindowsError: [Error 216] This version of %1 is not compatible with the version of Windows you're running. Check your computer's system information to see whether you need a x86 (32-bit) or x64 (64-bit) version of the program, and then contact the software publisher@@@ Both say they’re on vm5-m3. This is our bot for 32-bit x86 builds on a 32-bit x86 OS, and it’s running something old-ish (Windows 7?) and is the fiddliest. Our 64-on-64 and 32-on-64 bots are running something newer and are OK.
,
Jul 20 2017
Infra-Troopers to try to attract more attention. We have broken bots.
,
Jul 20 2017
possibly a python packaging change?
,
Jul 20 2017
"client.crashpad" isn't a staging builder: https://chromium.googlesource.com/chromium/tools/build/+/2a78f892f17ab7884ad722e96e3cdd872536902d/scripts/slave/cipd_bootstrap_v2.py#28 It will not have these packages employed.
,
Jul 20 2017
Oh nevermind you meant the bootstrap. It might be, let me look.
,
Jul 20 2017
I SSH'd onto "vm5-m3". The Python installation looks fine. However, when I run "ninja.exe" in command-line, I get: > E:\b\depot_tools\ninja.exe $ ninja This version of E:\b\depot_tools\ninja.exe is not compatible with the version of Windows you're running. Check your computer's system information to see whether you need a x86 (32-bit) or x64 (64-bit) version of the program, and then contact the software publisher. $ file ninja.exe ninja.exe: PE32+ executable (console) x86-64, for MS Windows $ uname -a MINGW32_NT-6.1 VM5-M3 2.5.0(0.295/5/3) 2016-03-31 18:26 i686 Msys It looks like the "ninja.exe" bundled in "depot_tools" is 64-bit, but the OS is 32-bit, which is generally a direction that doesn't work: https://chromium.googlesource.com/chromium/tools/depot_tools/+/master/ninja.exe Not sure what the right solution here is. IMO the *best* solution is to use CIPD to distribute Ninja, but that is probably more nuanced than just a simple replacement.
,
Jul 20 2017
But why did it break now? ninja.exe hasn't been touched in 8 months...
,
Jul 20 2017
Is it possible that the machine was reprovisioned recently as a 32-bit system?
,
Jul 20 2017
No, the point of this system is to test on a 32-bit OS.
,
Jul 20 2017
Ah look at this: https://uberchromegw.corp.google.com/i/client.crashpad/builders/crashpad_try_win_x86_rel/builds/187/steps/compile%20with%20ninja/logs/stdio cmd: ['E:\\b\\depot_tools\\python276_bin\\ninja.exe', Looks like the "python276_bin" bundle included a 32-bit "ninja.exe" on PATH, which was preferred over the default one in depot_tools. That's pretty awful...
,
Jul 20 2017
Wait maybe not. I just downloaded the bundle and "ninja.exe" is not anywhere in it: https://storage.googleapis.com/chrome-infra/python276_bin.zip Is it possible that someone manually copied a 32-bit Ninja into this arbitrary path so that it would be included in PATH?
,
Jul 20 2017
Scott (cc-ed) might know about that, but he’s away.
,
Jul 20 2017
So AFAICT someone manually placed "ninja.exe" in this directory so that it would be included in PATH. They probably did this rather than put it in "depot_tools" directly b/c it would conflict with the checked-in "ninja.exe" there. This is obviously problematic for a variety of reasons: - It's mysterious and unmanaged, meaning that if the system were reprovisioned, this would disappear. - It fails to update - the ninja on the system is version 1.6.0 - the one at HEAD in "depot_tools" is 1.7.2. - It is abusing an implementation detail, namely that "python276_bin" is in PATH. When that implementation detail changed, this broke. I also don't see any 32-bit Windows Ninja releases on GitHub: https://github.com/ninja-build/ninja/releases A better thing to do would be to have builders explicitly install a "ninja" version based on their platform. +dpranke@, thoughts on where we should go from here?
,
Jul 20 2017
We don't support building on 32-bit systems. If you want to run tests on 32-bit, you need a builder/tester setup. (Or run tests on 32-bit swarming, if that works.)
,
Jul 20 2017
^^ What thakis said in #c14 , except that builder/tester split isn't really an option, either. The 32-bit tests need to be run under swarming if we want them to be run on a 32-bit system. And nothing should be building on 32-bit systems. I'm actually kinda surprised that we had working crashpad trybots; I thought we had flailed on that months ago and didn't realize we had eventually gotten it working. Let's not spend any more time on trying to figure out how this was working before, since I don't want us to support it regardless :).
,
Jul 20 2017
That sounds fine. Can you help us get it working now, then?
,
Jul 20 2017
Sure :).
,
Jul 20 2017
,
Jul 24 2017
Yes, I did the awful thing described in #c13. (My weak defense is that it had been months of broken bots, discussing how bots configs might be refactored some day, and whether and how we might be be able to swarm or split build/test or whatever, and putting a 32b ninja into the path solved the problem and took 5 minutes. Sorry.) These bots (and the x64 ones) also rely on a manually installed Visual Studio.
,
Jul 24 2017
No worries - we see one-off solutions all the time. In the past, it has been impossibly difficult to address some of the problems that our builders have had. Since then, we've developed a lot of solutions that help address these issues. So question: do we want to do 32-bit Ninja builds? If we do, there is a pretty sane solution (CIPD + Recipe Integration) now to deploy 32-bit Ninja. WRT Visual Studio, we can probably do a similar thing: we can bundle VS up w/ CIPD and deploy it as-needed via a recipe.
,
Jul 24 2017
> do we want to do 32-bit Ninja builds? No. We require 64-bit systems for building. So we need some 64-bit builder / 32-bit swarming tester setup here.
,
Jul 24 2017
OK, does that eliminate the Visual Studio issue from #19 as well?
,
Jul 24 2017
No, Crashpad doesn't fork all the junk from Chrome build/ to get the depot_tools toolchain, so it still would depend upon the Visual Studio system install.
,
Jul 24 2017
What Visual Studio version do you require?
,
Jul 24 2017
Visual Studio 2015 Update 3.
,
Jul 24 2017
I've been working on this stuff over the past few days, and I'm trying to fix this in the "right long-term way". In other words, we need to make the crashpad checkout hermetic so that it'll run properly under LUCI, which is where I'm configuring the builders. That means that we need to deps in depot_tools to get access to the ninja binaries, and then pull in the vs toolchain or the mac toolchain via hooks. Hopefully there aren't other dependencies, but I stumbled over at least those already.
,
Jul 25 2017
dpranke@: Another, probably better option is to just make CIPD packages out of the Ninja binaries and then have the recipe download them and add them to PATH before running the build. Making the CIPD packages from the current binary set is trivial; setting up a builder to continue to make those packages for new versions of Ninja is probably a little more involved, but still do-able. That way, you don't have to pull in the whole "depot_tools' monstrosity to get a binary (that shouldn't really be checked into depot_tools in the first place!)
,
Jul 29 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/f07d3eeb01f4de86064a595a98d9915fb6b69cb8 commit f07d3eeb01f4de86064a595a98d9915fb6b69cb8 Author: Dirk Pranke <dpranke@chromium.org> Date: Sat Jul 29 22:42:58 2017 Remove win x86 builders, we'll only build on x64 for now. Since we don't want to support building directly on 32-bit bots, the 32-bit builders on crashpad are effectively useless. This CL removes them. This means that the 32-bit-build-on-32-bit-hosts configuration is (still) untested and will be until we are running the tests under swarming. TBR=mark@chromium.org BUG=743139 Change-Id: I7faebadf4e6f9b349e057f2a7840cd8477e1e2b8 Reviewed-on: https://chromium-review.googlesource.com/592535 Reviewed-by: Dirk Pranke <dpranke@chromium.org> Commit-Queue: Dirk Pranke <dpranke@chromium.org> [modify] https://crrev.com/f07d3eeb01f4de86064a595a98d9915fb6b69cb8/scripts/slave/recipes/crashpad/build.py [delete] https://crrev.com/e90a27fb973a8aa79b72c486fbc31a2e4d7a3cf3/scripts/slave/recipes/crashpad/build.expected/crashpad_try_win_x86_rel.json [delete] https://crrev.com/e90a27fb973a8aa79b72c486fbc31a2e4d7a3cf3/scripts/slave/recipes/crashpad/build.expected/crashpad_win_x86_dbg.json [modify] https://crrev.com/f07d3eeb01f4de86064a595a98d9915fb6b69cb8/masters/master.client.crashpad/builders.pyl
,
Jul 30 2017
Mac, Win x64, and Win x86-on-x64 should be working fine again now. I've removed the the x86-on-x86 builders. I've also posted the CLs that I think we'll need to run the x86-on-x86 build on x86 bots under swarming here: https://chromium-review.googlesource.com/c/580375/ https://chromium-review.googlesource.com/c/580375/ Theoretically they can be landed in either order. Once they've both landed, we should have the tests running on x86-on-x86 again. There will still be further cleanup we can do, like consolidating down to one Win builder and running more tests in parallel, and moving completely to LUCI for the tryjobs, but this should be enough for now.
,
Jul 31 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/93ec940b1a043f51e2cdfc1673a183bd84064ce9 commit 93ec940b1a043f51e2cdfc1673a183bd84064ce9 Author: Dirk Pranke <dpranke@chromium.org> Date: Mon Jul 31 15:26:47 2017 Update crashpad recipe to supporting swarming. In order to move the bots onto LUCI-based tryservers and test on non-x64 platforms, we need to be running the tests under swarming rather than running them locally on the build machines. This CL adds in the initial support for this; the tests are run as before, but if the file //build/swarming_tests_spec.pyl exists, the tests listed in it will be run under swarming as well. BUG=697636, 743139 Change-Id: Ia5c6550a99f3e0c5138874547d8d86f68e180141 Reviewed-on: https://chromium-review.googlesource.com/580375 Reviewed-by: Mark Mentovai <mark@chromium.org> Commit-Queue: Dirk Pranke <dpranke@chromium.org> [modify] https://crrev.com/93ec940b1a043f51e2cdfc1673a183bd84064ce9/scripts/slave/recipes/crashpad/build.expected/crashpad_try_win_x86_wow64_dbg.json [modify] https://crrev.com/93ec940b1a043f51e2cdfc1673a183bd84064ce9/scripts/slave/recipes/crashpad/build.expected/crashpad_mac_dbg.json [modify] https://crrev.com/93ec940b1a043f51e2cdfc1673a183bd84064ce9/scripts/slave/recipes/crashpad/build.expected/crashpad_win_x64_rel.json [modify] https://crrev.com/93ec940b1a043f51e2cdfc1673a183bd84064ce9/scripts/slave/recipes/crashpad/build.expected/crashpad_try_win_x64_dbg.json [modify] https://crrev.com/93ec940b1a043f51e2cdfc1673a183bd84064ce9/scripts/slave/recipes/crashpad/build.expected/crashpad_win_x86_wow64_rel.json [modify] https://crrev.com/93ec940b1a043f51e2cdfc1673a183bd84064ce9/scripts/slave/recipes/crashpad/build.expected/crashpad_try_mac_rel.json [modify] https://crrev.com/93ec940b1a043f51e2cdfc1673a183bd84064ce9/scripts/slave/README.recipes.md [modify] https://crrev.com/93ec940b1a043f51e2cdfc1673a183bd84064ce9/scripts/slave/recipes/crashpad/build.py [modify] https://crrev.com/93ec940b1a043f51e2cdfc1673a183bd84064ce9/scripts/slave/recipes/crashpad/build.expected/crashpad_mac_dbg_clobber.json
,
Jul 31 2017
The trybots are mostly working now, but there’s a new “read swarming_test_spec” step that’s failing and sending e-mail. It’s noisy, but it doesn’t appear to be turning try jobs red. From https://chromium-cq-status.appspot.com/v2/patch-status/chromium-review.googlesource.com/593148/2: https://build.chromium.org/p/client.crashpad/builders/crashpad_try_mac_dbg/builds/213 https://build.chromium.org/p/client.crashpad/builders/crashpad_try_mac_rel/builds/211 https://build.chromium.org/p/client.crashpad/builders/crashpad_try_win_x86_wow64_dbg/builds/197 Errors like: @@@STEP_TEXT@UNKNOWN: [Errno 2] No such file or directory: '/b/build/slave/crashpad_try_mac_dbg/build/crashpad/build/swarming_test_spec.pyl'@@@ @@@STEP_FAILURE@@@
,
Jul 31 2017
bah, I thought my try/catch would catch that. I'll revert that change and try to fix this later today ...
,
Jul 31 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/c623f337dbcc93540a9f3fc01bcf9e51bb8a343c commit c623f337dbcc93540a9f3fc01bcf9e51bb8a343c Author: Dirk Pranke <dpranke@chromium.org> Date: Mon Jul 31 16:50:59 2017 Revert "Update crashpad recipe to supporting swarming." This reverts commit 93ec940b1a043f51e2cdfc1673a183bd84064ce9. Reason for revert: The try/catch doesn't appear to work for handling a missing file :(. Original change's description: > Update crashpad recipe to supporting swarming. > > In order to move the bots onto LUCI-based tryservers > and test on non-x64 platforms, we need to be running > the tests under swarming rather than running them locally > on the build machines. > > This CL adds in the initial support for this; the tests > are run as before, but if the file //build/swarming_tests_spec.pyl > exists, the tests listed in it will be run under swarming as well. > > BUG=697636, 743139 > > Change-Id: Ia5c6550a99f3e0c5138874547d8d86f68e180141 > Reviewed-on: https://chromium-review.googlesource.com/580375 > Reviewed-by: Mark Mentovai <mark@chromium.org> > Commit-Queue: Dirk Pranke <dpranke@chromium.org> TBR=dpranke@chromium.org,mark@chromium.org Change-Id: If22486d96c88299984d29698e9ac10368800c2b9 No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: 697636, 743139 Reviewed-on: https://chromium-review.googlesource.com/594211 Reviewed-by: Dirk Pranke <dpranke@chromium.org> Commit-Queue: Dirk Pranke <dpranke@chromium.org> [modify] https://crrev.com/c623f337dbcc93540a9f3fc01bcf9e51bb8a343c/scripts/slave/recipes/crashpad/build.expected/crashpad_try_win_x86_wow64_dbg.json [modify] https://crrev.com/c623f337dbcc93540a9f3fc01bcf9e51bb8a343c/scripts/slave/recipes/crashpad/build.expected/crashpad_mac_dbg.json [modify] https://crrev.com/c623f337dbcc93540a9f3fc01bcf9e51bb8a343c/scripts/slave/recipes/crashpad/build.expected/crashpad_win_x64_rel.json [modify] https://crrev.com/c623f337dbcc93540a9f3fc01bcf9e51bb8a343c/scripts/slave/recipes/crashpad/build.expected/crashpad_try_win_x64_dbg.json [modify] https://crrev.com/c623f337dbcc93540a9f3fc01bcf9e51bb8a343c/scripts/slave/recipes/crashpad/build.expected/crashpad_win_x86_wow64_rel.json [modify] https://crrev.com/c623f337dbcc93540a9f3fc01bcf9e51bb8a343c/scripts/slave/recipes/crashpad/build.expected/crashpad_try_mac_rel.json [modify] https://crrev.com/c623f337dbcc93540a9f3fc01bcf9e51bb8a343c/scripts/slave/README.recipes.md [modify] https://crrev.com/c623f337dbcc93540a9f3fc01bcf9e51bb8a343c/scripts/slave/recipes/crashpad/build.py [modify] https://crrev.com/c623f337dbcc93540a9f3fc01bcf9e51bb8a343c/scripts/slave/recipes/crashpad/build.expected/crashpad_mac_dbg_clobber.json
,
Aug 2 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/be5c9f33947e8cd102be9b49d807ed7f1bc82e9a commit be5c9f33947e8cd102be9b49d807ed7f1bc82e9a Author: Dirk Pranke <dpranke@chromium.org> Date: Wed Aug 02 01:23:40 2017 Re-land "Update crashpad recipe to supporting swarming." This is attempt #2 to make the crashpad/build recipe support a src-side configuration of steps to run under swarming. Bug: 743139 Change-Id: I499ae4ec8fe2ba35473a2c3e6100d3b01bbc33b9 Reviewed-on: https://chromium-review.googlesource.com/596693 Reviewed-by: Mark Mentovai <mark@chromium.org> Commit-Queue: Dirk Pranke <dpranke@chromium.org> [modify] https://crrev.com/be5c9f33947e8cd102be9b49d807ed7f1bc82e9a/scripts/slave/recipes/crashpad/build.expected/crashpad_try_win_x86_wow64_dbg.json [modify] https://crrev.com/be5c9f33947e8cd102be9b49d807ed7f1bc82e9a/scripts/slave/recipes/crashpad/build.expected/crashpad_mac_dbg.json [modify] https://crrev.com/be5c9f33947e8cd102be9b49d807ed7f1bc82e9a/scripts/slave/recipes/crashpad/build.expected/crashpad_win_x64_rel.json [modify] https://crrev.com/be5c9f33947e8cd102be9b49d807ed7f1bc82e9a/scripts/slave/recipes/crashpad/build.expected/crashpad_try_win_x64_dbg.json [modify] https://crrev.com/be5c9f33947e8cd102be9b49d807ed7f1bc82e9a/scripts/slave/recipes/crashpad/build.expected/crashpad_win_x86_wow64_rel.json [modify] https://crrev.com/be5c9f33947e8cd102be9b49d807ed7f1bc82e9a/scripts/slave/recipes/crashpad/build.expected/crashpad_try_mac_rel.json [modify] https://crrev.com/be5c9f33947e8cd102be9b49d807ed7f1bc82e9a/scripts/slave/README.recipes.md [modify] https://crrev.com/be5c9f33947e8cd102be9b49d807ed7f1bc82e9a/scripts/slave/recipes/crashpad/build.py
,
Aug 7 2017
,
Sep 1 2017
I think the work is mostly done here, but there's still some that needs to be done to get x86-on-x86 working. mark@, I'm punting this back to you to take it from here.
,
Sep 2 2017
,
Oct 26 2017
The following revision refers to this bug: https://chromium.googlesource.com/crashpad/crashpad.git/+/5e9ed4cb9f69ffc60e47d7e5d8312f6417dcb323 commit 5e9ed4cb9f69ffc60e47d7e5d8312f6417dcb323 Author: Mark Mentovai <mark@chromium.org> Date: Thu Oct 26 18:31:57 2017 win: Dynamically disable WoW64 tests absent explicit 32-bit build output Rather than having the 64-bit build assume that it lives in out\{Debug,Release}_x64 and that it can find 32-bit build output in out\{Debug,Release}, require the location of 32-bit build output to be provided explicitly via the CRASHPAD_TEST_32_BIT_OUTPUT environment variable. If this variable is not set, 64-bit tests that require 32-bit test build output will dynamically disable themselves at runtime. In order for this to work, a new DISABLED_TEST() macro is added to support dynamically disabled tests. gtest does not have its own first-class support for this (https://groups.google.com/d/topic/googletestframework/Nwh3u7YFuN4, https://github.com/google/googletest/issues/490) so this local solution is used instead. For tests via Crashpad’s own build\run_tests.py, which is how Crashpad’s own buildbots and trybots invoke tests, CRASHPAD_TEST_32_BIT_OUTPUT is set to a locaton compatible with the paths expected for the GYP-based build. No test coverage is lost on Crashpad’s own buildbots and trybots. For Crashpad tests in Chromium’s buildbots and trybots, this environment variable will not be set, causing these tests to be dynamically disabled. Bug: crashpad:203 , chromium:743139, chromium:777924 Change-Id: I3c0de2bf4f835e13ed5a4adda5760d6fed508126 Reviewed-on: https://chromium-review.googlesource.com/739795 Commit-Queue: Mark Mentovai <mark@chromium.org> Reviewed-by: Scott Graham <scottmg@chromium.org> [modify] https://crrev.com/5e9ed4cb9f69ffc60e47d7e5d8312f6417dcb323/build/run_tests.py [modify] https://crrev.com/5e9ed4cb9f69ffc60e47d7e5d8312f6417dcb323/snapshot/win/end_to_end_test.py [modify] https://crrev.com/5e9ed4cb9f69ffc60e47d7e5d8312f6417dcb323/snapshot/win/exception_snapshot_win_test.cc [modify] https://crrev.com/5e9ed4cb9f69ffc60e47d7e5d8312f6417dcb323/snapshot/win/extra_memory_ranges_test.cc [modify] https://crrev.com/5e9ed4cb9f69ffc60e47d7e5d8312f6417dcb323/snapshot/win/pe_image_annotations_reader_test.cc [modify] https://crrev.com/5e9ed4cb9f69ffc60e47d7e5d8312f6417dcb323/snapshot/win/process_snapshot_win_test.cc [modify] https://crrev.com/5e9ed4cb9f69ffc60e47d7e5d8312f6417dcb323/test/gmock_main.cc [add] https://crrev.com/5e9ed4cb9f69ffc60e47d7e5d8312f6417dcb323/test/gtest_disabled.cc [add] https://crrev.com/5e9ed4cb9f69ffc60e47d7e5d8312f6417dcb323/test/gtest_disabled.h [modify] https://crrev.com/5e9ed4cb9f69ffc60e47d7e5d8312f6417dcb323/test/gtest_main.cc [modify] https://crrev.com/5e9ed4cb9f69ffc60e47d7e5d8312f6417dcb323/test/test.gyp [modify] https://crrev.com/5e9ed4cb9f69ffc60e47d7e5d8312f6417dcb323/test/test_paths.cc [modify] https://crrev.com/5e9ed4cb9f69ffc60e47d7e5d8312f6417dcb323/test/test_paths.h [modify] https://crrev.com/5e9ed4cb9f69ffc60e47d7e5d8312f6417dcb323/util/win/process_info_test.cc
,
Nov 2 2017
All right, I think we’re ready to pick this back up again.
For reference again, the configurations we care about:
# a) x64 OS, x64 handler, x64 client - currently tested by “x64”
# b) x64 OS, x64 handler, x86 client - currently tested by “x64”
# c) x64 OS, x86 handler, x86 client - currently tested by “x86_wow64”
# d) x86 OS, x86 handler, x86 client - currently missing coverage
With the GYP build set up as it is now, (a) uses {Debug,Release}_x64, (c) and (d) use {Debug,Release}, and (b) uses both.
If the tests are driven from run_tests.py, on a 64-bit tester, providing {Debug,Release}_x64 and {Debug,Release} as siblings and directing run_tests.py to the x64 version will actually test (a) and (b) together. If this is too difficult to start with, providing only {Debug,Release}_x64 without a {Debug,Release} sibling will test (a) only. We can temporarily lose (b) coverage provided we have some idea of how to regain it, and we regain (c) coverage at the same time.
(c) and (d) should be trivial now. Provide {Debug,Release} to (c) a 64-bit tester or (d) a 32-bit tester and it should just work.
It would be sufficient for builders to simply do both 64-bit and 32-bit builds for a single job, and to then ship the output off to testers as needed.
I’m not sure how to specify the executable and test data dependencies for swarming, but if you give me a reference, I’m happy to put it all together. I just did the same thing to be able to run the tests in Chromium’s GN files (https://chromium-review.googlesource.com/c/chromium/src/+/751403), and the relationships are pretty straightforward.
Dirk, if you’ve got cycles, I’m happy to move forward with this. If we can fix bug 697636 and add capacity at the same time, it’d be icing or gravy, depending on your perspective.
|
|||||||||||
►
Sign in to add a comment |
|||||||||||
Comment 1 by mark@chromium.org
, Jul 14 2017