Windows GPU bots failing to find crash_service.exe
Issue description

Multiple Windows GPU bots on the main and fyi waterfalls are hitting an exception in the start_crash_service step, with output:

Unable to find E:\b\build\slave\Win7_Release__NVIDIA_\build\src\out\Release\crash_service.exe

Build logs:
https://build.chromium.org/p/chromium.gpu/builders/Win7%20Release%20%28NVIDIA%29/builds/47556
https://build.chromium.org/p/chromium.gpu/builders/Win7%20Debug%20%28NVIDIA%29/builds/39067
https://build.chromium.org/p/chromium.gpu/builders/Win7%20Release%20(ATI)
https://build.chromium.org/p/chromium.gpu.fyi/builders/Win8%20Release%20%28NVIDIA%29/builds/21494
https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20Release%20%28ATI%29/builds/19646
,
Apr 8 2016
Oh god no. We went through so much to make sure this worked after it landed, and now it has been reverted? Context: https://bugs.chromium.org/p/chromium/issues/detail?id=601640
,
Apr 8 2016
And context for why the CL was reverted: https://bugs.chromium.org/p/chromium/issues/detail?id=601762
,
Apr 8 2016
Story:

Original crash_service removal CL: https://codereview.chromium.org/1862773003

This caused the Win GPU bots to start failing, because they were still running the crash_service (because it doesn't get killed at the end of a build) but weren't using it: https://bugs.chromium.org/p/chromium/issues/detail?id=601640

Fix for that was to auto_reboot: https://codereview.chromium.org/1866403003

However, at the same time the GPU bots on the perf waterfall started failing to compile/link, due to being unable to find crash_service: https://bugs.chromium.org/p/chromium/issues/detail?id=601762

This was fixed by reverting the crash_service recipe CL: https://codereview.chromium.org/1871583004

So it looks to me like the crash_service recipe CL maybe removed *too many* references to it? Or maybe the problem is simply that the perf gpu bots and the main waterfall gpu bots are building slightly different targets that don't have their dependencies in sync, so the latter still need the crash service?
,
Apr 8 2016
In particular, note that a successful build[1] on the perf waterfall explicitly builds the crash_service target (very end of line at the top of the log) while a failing build[2] doesn't specify that target on the command line at all.

[1]: https://build.chromium.org/p/chromium.perf/builders/Win%20Builder/builds/6882/steps/compile/logs/stdio
[2]: https://build.chromium.org/p/chromium.perf/builders/Win%20Builder/builds/6881/steps/compile/logs/stdio
,
Apr 8 2016
And now sergiyb has reverted the revert: https://codereview.chromium.org/1867293003

The perf bots have broken again: https://build.chromium.org/p/chromium.perf/builders/Win%20Builder/builds/6887

The main waterfall bots haven't cycled again yet.
,
Apr 8 2016
dpranke, I need your help to figure out why the crash_service recipe CL would cause compile/link to fail on the perf bots. I don't have enough context there to know why that is happening, or if any src-side changes need to land to fix it.
,
Apr 8 2016
Strangely, despite the revert of the revert, we now also have a green build on the perf bots: https://build.chromium.org/p/chromium.perf/builders/Win%20Builder/builds/6886
,
Apr 8 2016
Dirk's crash_service CL should remain in. Reverting it was a mistake. We need to understand why the Perf builders are still trying to build crash_service. I've found a couple of references but they're in the GN build.
,
Apr 8 2016
To clarify: Dirk's CL https://codereview.chromium.org/1862773003 will stop manually adding crash_service to the list of built targets. The question is why the Perf bots are trying to run the manifest tool against the nonexistent crash_service binary.
,
Apr 8 2016
If you search the compile logs on the perf bots, they're still trying to STAMP the crash_service: https://build.chromium.org/p/chromium.perf/builders/Win%20Builder/builds/6892/steps/compile/logs/stdio
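For anyone repeating that search on a saved copy of the stdio log, a minimal sketch follows. It assumes the log has been downloaded as plain text; the function name and the sample log lines are hypothetical and only illustrate the idea, they are not taken from the actual bot output.

```python
def find_target_references(log_text, target="crash_service"):
    """Return (line_number, line) pairs for every log line mentioning the target.

    Line numbers are 1-based so they match what a text editor shows.
    """
    hits = []
    for number, line in enumerate(log_text.splitlines(), 1):
        if target in line:
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical log excerpt; the real compile log format may differ.
    sample_log = "\n".join([
        "[1/3] CXX obj/base/foo.obj",
        "[2/3] STAMP obj/chrome/crash_service.actions_depends.stamp",
        "[3/3] LINK chrome.exe",
    ])
    for number, line in find_target_references(sample_log):
        print(number, line)
```

Any remaining hit after the recipe change suggests that some other target still depends on crash_service, rather than the recipe adding it to the command line.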
,
Apr 8 2016
Which means my naive guess would be that the 'chrome_builder_perf' compile target depends on the crash service?
,
Apr 8 2016
,
Apr 8 2016
,
Apr 8 2016
Current theory is that prior to my CL, we were running crash_service on all of the bots, including the builders. When we landed (or re-landed) my CL, we stopped killing any existing crash_service processes, and so the next compile on the builder failed because the file was still open. I've landed a follow-up CL that will still try to kill any crash_service processes, and hopefully that'll fix things, or at least reveal the next issue ;).
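The "still try to kill any crash_service processes" idea can be sketched roughly as below. This is not the code from the follow-up CL; it is a minimal illustration that assumes the standard Windows `taskkill` tool is available on the bot. The helper names and the injectable `run` and `platform` parameters are hypothetical, added so the sketch can be exercised off Windows.

```python
import subprocess
import sys


def build_kill_command(image_name="crash_service.exe"):
    """Return the taskkill argv that force-terminates all processes with this image name."""
    return ["taskkill", "/F", "/IM", image_name]


def kill_stale_processes(image_name="crash_service.exe",
                         run=subprocess.call,
                         platform=sys.platform):
    """Try to kill leftover processes so the next compile can overwrite the binary.

    Returns the exit code from the runner, or None when not on Windows.
    A nonzero exit code is not treated as fatal here, since the process
    may simply not be running.
    """
    if not platform.startswith("win"):
        return None
    return run(build_kill_command(image_name))


if __name__ == "__main__":
    print(kill_stale_processes())
```

Running this before the compile step would release the lock on the old crash_service.exe, which is what the theory above says was blocking the link.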
,
Apr 8 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/4cc1754d4ddb33f39dd9fe9cc9518f78bd840091

commit 4cc1754d4ddb33f39dd9fe9cc9518f78bd840091
Author: kbr <kbr@chromium.org>
Date: Fri Apr 08 23:14:35 2016

Remove stray references to crash_service.

Stop bundling it in the installer (FILES.cfg). Remove the references to the crash_service target from the perf builders' targets. (Issue 601762)

It's necessary to proceed with the removal of the start_crash_service step on the bots, because putting it back now breaks the GPU bots (Issue 601839).

The BUILD.gn removals are proactive, and not actually used yet.

BUG=601762, 601839
Review URL: https://codereview.chromium.org/1875613004
Cr-Commit-Position: refs/heads/master@{#386239}

[modify] https://crrev.com/4cc1754d4ddb33f39dd9fe9cc9518f78bd840091/BUILD.gn
[modify] https://crrev.com/4cc1754d4ddb33f39dd9fe9cc9518f78bd840091/build/all.gyp
[modify] https://crrev.com/4cc1754d4ddb33f39dd9fe9cc9518f78bd840091/chrome/tools/build/win/FILES.cfg
[modify] https://crrev.com/4cc1754d4ddb33f39dd9fe9cc9518f78bd840091/tools/perf/chrome_telemetry_build/BUILD.gn
,
Apr 9 2016
dpranke's CL fixed the Perf builders. Mine helps ensure his CL won't be reverted again, which would break the GPU bots. Closing as fixed.
,
Apr 27 2016
Comment 1 by jo...@chromium.org
, Apr 8 2016