WebRTC Mac Tester (stats) bots failing
Issue description

WebRTC Mac Tester (stats) and FYI Mac Tester (stats) started failing at around the same time.

First failure for FYI: https://build.chromium.org/deprecated/chromium.webrtc.fyi/builders/Mac%20Tester/builds/53647
First failure for WebRTC Chromium bot: https://build.chromium.org/deprecated/chromium.webrtc/builders/Mac%20Tester/builds/82482

The FYI bot fails only on WebRtcApprtcBrowserTest.MANUAL_WorksOnApprtc. Sample logs:

[ RUN ] WebRtcApprtcBrowserTest.MANUAL_WorksOnApprtc
[11336:515:0821/233411.143817:3579233911747:FATAL:double_fork_and_exec.cc(107)] execvp /b/c/b/Mac_Tester/src/out/Release/Chromium.app/Contents/Versions/70.0.3531.0/Chromium Framework.framework/Helpers/crashpad_handler: No such file or directory (2)
0 browser_tests 0x0000000109b9db0c base::debug::StackTrace::StackTrace(unsigned long) + 28
1 browser_tests 0x0000000109af85c1 logging::LogMessage::~LogMessage() + 225
2 browser_tests 0x0000000109af911c logging::ErrnoLogMessage::~ErrnoLogMessage() + 124
3 browser_tests 0x000000010b489981 crashpad::DoubleForkAndExec(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, int, bool, void (*)()) + 785
4 browser_tests 0x000000010b486f5f crashpad::(anonymous namespace)::HandlerStarter::CommonStart(base::FilePath const&, base::FilePath const&, base::FilePath const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > > const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, base::ScopedGeneric<unsigned int, base::mac::internal::ReceiveRightTraits>, crashpad::(anonymous namespace)::HandlerStarter*, bool) + 2415
5 browser_tests 0x000000010b486123 crashpad::CrashpadClient::StartHandler(base::FilePath const&, base::FilePath const&, base::FilePath const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > > const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, bool, bool) + 563
6 browser_tests 0x000000010cc286f8 crash_reporter::internal::PlatformCrashpadInitialization(bool, bool, bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, base::FilePath const&) + 856
7 browser_tests 0x000000010cc28901 crash_reporter::InitializeCrashpad(bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 113
8 browser_tests 0x000000010a99a527 ChromeMainDelegate::InitMacCrashReporter(base::CommandLine const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 135
9 browser_tests 0x000000010a99a73e ChromeMainDelegate::PreSandboxStartup() + 174
10 browser_tests 0x0000000109a7ac77 content::ContentMainRunnerImpl::Initialize(content::ContentMainParams const&) + 1111
11 browser_tests 0x000000010bb4fd16 service_manager::Main(service_manager::MainParams const&) + 2566
12 browser_tests 0x0000000109a7a484 content::ContentMain(content::ContentMainParams const&) + 68
13 browser_tests 0x000000010a1797f9 content::BrowserTestBase::SetUp() + 2697
14 browser_tests 0x0000000109bd33a3 InProcessBrowserTest::SetUp() + 723
15 browser_tests 0x0000000107aa592d testing::Test::Run() + 109
16 browser_tests 0x0000000107aa65c0 testing::TestInfo::Run() + 320
17 browser_tests 0x0000000107aa6b37 testing::TestCase::Run() + 279
18 browser_tests 0x0000000107ab2127 testing::internal::UnitTestImpl::RunAllTests() + 871
19 browser_tests 0x0000000107ab1d9d testing::UnitTest::Run() + 109
20 browser_tests 0x0000000109beda36 base::TestSuite::Run() + 166
21 browser_tests 0x0000000109ad9a15 ChromeTestSuiteRunner::RunTestSuite(int, char**) + 37
22 browser_tests 0x000000010a19c118 content::LaunchTests(content::TestLauncherDelegate*, unsigned long, int, char**) + 552
23 browser_tests 0x0000000109ad9eed LaunchChromeTests(unsigned long, content::TestLauncherDelegate*, int, char**) + 333
24 browser_tests 0x0000000109ad998e main + 94
25 libdyld.dylib 0x00007fff6c090115 start + 1

The Chromium bot fails on many tests with uninformative stack traces. Sample logs:

[ RUN ] WebRtcAudioBrowserTest.EstablishAudioVideoCallAndEnsureAudioIsPlaying/1
Received signal 4 <unknown> 7fff88f43792
[0x0001117bbb2c] [0x0001117bba21] [0x7fffa0e9fb3a] [0x7fca00608310] [0x7fff88f436c6] [0x7fff8901973f] [0x7fff8b1f0019] [0x7fffa03b1335] [0x7fff9f8a0d69] [0x7fff9f8a07de] [0x7fffa03af303] [0x7fff8b1efc55] [0x7fff88cbac28] [0x0001114ab7e9] [0x000110da6283] [0x000110dab7e1] [0x0001114aaf19] [0x00011146ad3f] [0x000110c9a352] [0x000112d70f71] [0x000110285264] [0x000111410709] [0x000111400778] [0x00010ffeff3d] [0x00010fff0bd0] [0x00010fff1147] [0x00010fffc737] [0x00010fffc3ad] [0x000111449a66] [0x0001114051ba] [0x00011142deb8] [0x000111405170] [0x7fffa0c90235] [0x00000000000a]
[end of stack trace]
,
Aug 22
This is indeed really weird, but those two waterfalls starting to fail at the same time sends a very strong signal that it's something about a commit in Chromium. It could sometimes also be an infra commit but I don't see anything there. Here are those two builds again, for visibility https://ci.chromium.org/buildbot/chromium.webrtc/Mac%20Tester/82482 https://ci.chromium.org/buildbot/chromium.webrtc.fyi/Mac%20Tester/53647 sweilun@, could you confirm or deny the possibility of your CL affecting browser_tests on Mac? https://chromium-review.googlesource.com/c/chromium/src/+/1181684 In any case... could we get a speculative revert? Just so we don't go looking for a problem where there isn't one... Then you can see if the next build succeeds on these two: https://ci.chromium.org/buildbot/chromium.webrtc/Mac%20Tester/ https://ci.chromium.org/buildbot/chromium.webrtc.fyi/Mac%20Tester/
,
Aug 22
,
Aug 24
,
Aug 24
,
Aug 24
,
Aug 24
guidou@: https://chromium-review.googlesource.com/1181684 was reverted & relanded, so it looks like that was determined to not be the cause? (I don't know much about WebRTC tests, but the changes in 1181684 seem unlikely to cause issues there).
,
Aug 24
ramyan@: That is correct. It was a speculative revert since it was the only CL in the blamelist, but the revert did not fix the bots and the CL was relanded.
,
Aug 24
Thanks for the confirmation guidou@!
,
Aug 24
I was able to reproduce the test failures locally with today's ToT. Then I started to bisect to find where the problem started, but was never able to reproduce again, even with the version that had initially failed. Maybe this was a temporary tool issue (perhaps some compiler bug) that got fixed at some point and the bots are still using the faulty one? WRT WebRTC rolls into Chrome, perhaps we should start a manual one, or disable the bots so the autoroller can resume automatic ones.
,
Aug 24
Also note that I didn't find any CLs around the time when the failure started that were related to the stack traces I was getting when I was able to reproduce.
,
Aug 27
I looked for a way to disable crashpad, but I couldn't find a convenient flag for it. We have plenty of tests that run the whole chrome browser, so I wonder why the apprtc tests in particular are affected. One guess is that we invoke browser_tests twice on these bots in two different build steps, and the first invocation gets the crash reporter binary into a weird state. Yes, I can see that the large tests bot fails in the same way: if we invoke browser_tests in different build steps on the same bot, all steps but the first fail.
,
Aug 27
I can't see any changes in recipes, breakpad or crashpad around Aug 22... https://chromium-review.googlesource.com/1181684 is the only one in the chromium blamelist, and it's not that one. Also unexplained are the content browsertest crashes.
,
Aug 27
,
Aug 27
I can see the failed build and the successful one differ by one //build CL for chromium.webrtc: https://chromium.googlesource.com/chromium/tools/build/+/dc454099775dcec7ba4c31dd0fb0eb4c9447b9e2. For FYI, however, //build stays the same so it's most likely not that one. The error is execvp /b/c/b/Mac_Tester/src/out/Release/Chromium.app/Contents/Versions/70.0.3531.0/Chromium Framework.framework/Helpers/crashpad_handler: No such file or directory (2). I wonder if the path gets misconstructed or if the binary is accidentally deleted or something.
,
Aug 27
I think the code to put the handler there in the first place is here: https://cs.chromium.org/chromium/src/chrome/BUILD.gn?type=cs&q=%22crashpad_handler%22&sq=package:chromium&g=0&l=866 When I logged into the bot, I could see everything in the path execvp /b/c/b/Mac_Tester/src/out/Release/Chromium.app/Contents/Versions/70.0.X.X/Chromium Framework.framework/ is there, but not Helpers/crashpad_handler. Looks like the code that asserts is WAI. For chromium.webrtc it seems the binary is gone when the tests start executing and for fyi it goes away after the first browser_tests invocation.
,
Aug 27
Aha, I did find the binary here in a slightly different path though: /b/c/b/Mac_Tester/src/out/Release/Chromium.app/Contents/Versions/70.0.3535.0/Chromium\ Framework.framework/Versions/A/Helpers/crashpad_handler
,
Aug 27
The path is constructed here in the .cc code: https://cs.chromium.org/chromium/src/components/crash/content/app/crashpad_mac.mm?type=cs&sq=package:chromium&g=0&l=123
,
Aug 27
All right, another theory: AppRTC tests and perf tests happen to be the only ones that aren't swarmed on that bot, so the swarming tasks obviously get the right build, but the tests that run on the bot itself don't. I'll look closer at the build that comes with the bot and compare with what the swarmed shards are getting.
,
Aug 27
The full mac build has the same dir problem as described in #16 and #17, but the swarmed equivalent DOES have out/Release/Chromium.app/Contents/Versions/70.0.3531.0/Chromium\ Framework.framework/Helpers/crashpad_handler. So this is a build packaging problem.
,
Aug 27
I can indeed see that the build https://ci.chromium.org/buildbot/chromium.webrtc/Mac%20Builder/101166 does have a functioning Helpers/breakpad. It turns out Helpers is actually a symlink into Versions/Current/Helpers/. I can see breakpad is the same between https://logs.chromium.org/v/?s=chromium%2Fbb%2Fchromium.webrtc%2FMac_Builder%2F101166%2F%2B%2Frecipes%2Fsteps%2Fpackage_build%2F0%2Fstdout (OK) and https://logs.chromium.org/v/?s=chromium%2Fbb%2Fchromium.webrtc%2FMac_Builder%2F101167%2F%2B%2Frecipes%2Fsteps%2Fpackage_build%2F0%2Fstdout (BAD), however "Chromium Framework.framework/Versions/Current" is added in the good version and not the bad one. New theory: the Versions/Current symlink has been broken somehow in packaged builds.
,
Aug 27
Yeah, the bad one has only Versions/A, and the good one has both Versions/A/ and Versions/Current -> A/
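The good/bad comparison above can be reproduced on a toy directory tree. The following is a minimal sketch, not code from the bots: `describe_framework` and `make_framework` are hypothetical helper names, and the layout is only modeled on what the thread reports (a `Helpers` symlink into `Versions/Current/Helpers`, and `Versions/Current` pointing at `A`).

```python
import os
import tempfile


def describe_framework(framework_dir):
    """Describe the entries that the crashpad_handler lookup depends on."""
    report = {}
    for rel in ("Versions/A", "Versions/Current", "Helpers", "Helpers/crashpad_handler"):
        path = os.path.join(framework_dir, rel)
        if os.path.islink(path):
            report[rel] = "symlink -> " + os.readlink(path)
        elif os.path.exists(path):
            report[rel] = "present"
        else:
            report[rel] = "MISSING"
    return report


def make_framework(root, with_symlinks):
    """Build a toy framework directory (assumed layout, modeled on the thread)."""
    framework = os.path.join(root, "Chromium Framework.framework")
    helpers = os.path.join(framework, "Versions", "A", "Helpers")
    os.makedirs(helpers)
    with open(os.path.join(helpers, "crashpad_handler"), "w") as f:
        f.write("stub")
    if with_symlinks:
        # Relative links, resolved against the directory that contains them.
        os.symlink("A", os.path.join(framework, "Versions", "Current"))
        os.symlink(os.path.join("Versions", "Current", "Helpers"),
                   os.path.join(framework, "Helpers"))
    return framework


with tempfile.TemporaryDirectory() as good_root, tempfile.TemporaryDirectory() as bad_root:
    good = describe_framework(make_framework(good_root, with_symlinks=True))
    bad = describe_framework(make_framework(bad_root, with_symlinks=False))

for name, report in (("good", good), ("bad", bad)):
    for rel, state in report.items():
        print(name, rel, state)
```

In the layout without symlinks, `Helpers/crashpad_handler` is reported as missing even though the binary still exists under `Versions/A/Helpers/`, which matches the execvp failure in the logs.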
,
Aug 27
Aha, this one is most likely the problem: https://chromium.googlesource.com/chromium/tools/build/+/dc454099775dcec7ba4c31dd0fb0eb4c9447b9e2 I'm going to speculatively revert it for now so we know if it's the culprit or not.
,
Aug 27
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/9a41d66725951f98d7c3a425e418ca8155abf00e commit 9a41d66725951f98d7c3a425e418ca8155abf00e Author: Patrik Höglund <phoglund@chromium.org> Date: Mon Aug 27 10:16:16 2018 Revert "zip_build: Apply the path filter recursively when creating a package archive." This reverts commit dc454099775dcec7ba4c31dd0fb0eb4c9447b9e2. Reason for revert: Speculative revert: looks like this breaks WebRTC build packaging for browser_tests. Original change's description: > zip_build: Apply the path filter recursively when creating a package archive. > > Previously we would only apply it to the direct descendants of the > build directory and not the contents of any of its subdirectories. This > meant that, for example, we ended up packaging the 'obj' and > 'thinlto-cache' directories from the build directories for toolchains > other than the default toolchain, despite these directories being > filtered out. > > We previously had code that handled the 'initial' directory as a > special case where the filter was also being applied to the files > in that directory. It seems like this code is now dead because the > directory is now named 'initialexe' (?), but in any event, since we > now have a generalization of that code, I've removed it. > > Bug: 876316 > Change-Id: Ieab27788bf3ca7c7bf970434ac491a053eaa5baf > Reviewed-on: https://chromium-review.googlesource.com/1184302 > Reviewed-by: Nico Weber <thakis@chromium.org> > Commit-Queue: Peter Collingbourne <pcc@chromium.org> TBR=thakis@chromium.org,dpranke@chromium.org,pcc@chromium.org # Not skipping CQ checks because original CL landed > 1 day ago. 
Bug: 876316 , 876743 Change-Id: Ie229755a4836e6e80d896801535b3d5ad8a98650 Reviewed-on: https://chromium-review.googlesource.com/1189962 Commit-Queue: Patrik Höglund <phoglund@chromium.org> Reviewed-by: Patrik Höglund <phoglund@chromium.org> [modify] https://crrev.com/9a41d66725951f98d7c3a425e418ca8155abf00e/scripts/slave/zip_build.py [modify] https://crrev.com/9a41d66725951f98d7c3a425e418ca8155abf00e/scripts/common/chromium_utils.py
,
Aug 27
And it greens up! Yay!

pcc: The problem here was that a symlink doesn't get zipped into the mac release build.

Before:
out/Release/Chromium.app/Contents/Versions/70.0.X.X/Chromium Framework.framework/Helpers/ -> Chromium.app/Contents/Versions/70.0.X.X/Chromium Framework.framework/Versions/Current/Helpers

After your patch:
out/Release/Chromium.app/Contents/Versions/70.0.X.X/Chromium Framework.framework/Helpers/ MISSING
Chromium.app/Contents/Versions/70.0.X.X/Chromium Framework.framework/Versions/Current MISSING

I think both the Helpers symlink itself and Versions/Current go missing, so the path out/Release/Chromium.app/Contents/Versions/70.0.X.X/Chromium Framework.framework/Helpers/crashpad_handler fails to resolve.

This obviously works on swarming. Unfortunately we have some tests that aren't converted to swarming. I plan to do that over the coming months but I can try to accelerate those plans if it's hard for you to fix+reland your CL.
,
Aug 27
,
Aug 28
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/d550bb44e12759d593d991a7a6f916af5d90b414 commit d550bb44e12759d593d991a7a6f916af5d90b414 Author: Peter Collingbourne <pcc@google.com> Date: Tue Aug 28 16:47:23 2018 Reland "zip_build: Apply the path filter recursively when creating a package archive." This relands commit dc454099775dcec7ba4c31dd0fb0eb4c9447b9e2. The original commit introduced a bug where symlinks that point to directories were omitted from the archive. The cause was that for some reason os.walk returns symbolic links that point to directories in dirs and not files, so we need to handle any symlinks in dirs as if they appeared in files instead. Original change's description: > Revert "zip_build: Apply the path filter recursively when creating a package archive." > > This reverts commit dc454099775dcec7ba4c31dd0fb0eb4c9447b9e2. > > Reason for revert: Speculative revert: looks like this breaks WebRTC build packaging for browser_tests. > > Original change's description: > > zip_build: Apply the path filter recursively when creating a package archive. > > > > Previously we would only apply it to the direct descendants of the > > build directory and not the contents of any of its subdirectories. This > > meant that, for example, we ended up packaging the 'obj' and > > 'thinlto-cache' directories from the build directories for toolchains > > other than the default toolchain, despite these directories being > > filtered out. > > > > We previously had code that handled the 'initial' directory as a > > special case where the filter was also being applied to the files > > in that directory. It seems like this code is now dead because the > > directory is now named 'initialexe' (?), but in any event, since we > > now have a generalization of that code, I've removed it. 
> > > > Bug: 876316 > > Change-Id: Ieab27788bf3ca7c7bf970434ac491a053eaa5baf > > Reviewed-on: https://chromium-review.googlesource.com/1184302 > > Reviewed-by: Nico Weber <thakis@chromium.org> > > Commit-Queue: Peter Collingbourne <pcc@chromium.org> > > TBR=thakis@chromium.org,dpranke@chromium.org,pcc@chromium.org > > # Not skipping CQ checks because original CL landed > 1 day ago. > > Bug: 876316 , 876743 > Change-Id: Ie229755a4836e6e80d896801535b3d5ad8a98650 > Reviewed-on: https://chromium-review.googlesource.com/1189962 > Commit-Queue: Patrik Höglund <phoglund@chromium.org> > Reviewed-by: Patrik Höglund <phoglund@chromium.org> Bug: 876316 , 876743 Change-Id: Ia6c329c797374e24d6bcf43d0cadd6517deb27b8 Reviewed-on: https://chromium-review.googlesource.com/1192234 Reviewed-by: Nico Weber <thakis@chromium.org> Commit-Queue: Peter Collingbourne <pcc@chromium.org> [modify] https://crrev.com/d550bb44e12759d593d991a7a6f916af5d90b414/scripts/slave/zip_build.py [modify] https://crrev.com/d550bb44e12759d593d991a7a6f916af5d90b414/scripts/common/chromium_utils.py |
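The os.walk behavior named in the reland description is easy to check in isolation. Below is a minimal sketch, not the actual zip_build.py or chromium_utils.py change: `collect_entries` is a hypothetical helper and the directory names are borrowed from the thread for illustration. It shows that a symlink pointing to a directory appears in the `dirnames` list, so a walker that only looks at `filenames` never sees it.

```python
import os
import tempfile


def collect_entries(root):
    """Walk root and return (files, symlinks) as sorted relative paths.

    os.walk lists a symlink that points to a directory in dirnames, and with
    the default followlinks=False it does not descend into it. Such links are
    therefore picked out of dirnames and treated like files.
    """
    files, symlinks = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                symlinks.append(os.path.relpath(path, root))
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            if os.path.islink(path):
                symlinks.append(rel)
            else:
                files.append(rel)
    return sorted(files), sorted(symlinks)


with tempfile.TemporaryDirectory() as root:
    helpers = os.path.join(root, "Versions", "A", "Helpers")
    os.makedirs(helpers)
    with open(os.path.join(helpers, "crashpad_handler"), "w") as f:
        f.write("stub")
    os.symlink("A", os.path.join(root, "Versions", "Current"))
    os.symlink(os.path.join("Versions", "Current", "Helpers"),
               os.path.join(root, "Helpers"))

    # What a filenames-only walker would see: the symlinks are absent.
    walked_filenames = [n for _, _, names in os.walk(root) for n in names]
    files, symlinks = collect_entries(root)

print("filenames only:", walked_filenames)
print("files:", files)
print("symlinks:", symlinks)
```

A packaging step built on the filenames-only view would copy `crashpad_handler` but drop both `Helpers` and `Versions/Current`, which is the state observed on the non-swarmed bots.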
Comment 1 by guidou@chromium.org
, Aug 22