
Issue 876743

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Aug 27
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 1
Type: Bug




WebRTC Mac Tester (stats) bots failing

Project Member Reported by guidou@chromium.org, Aug 22

Issue description

WebRTC Mac Tester (stats) and FYI Mac Tester (stats) started failing at around the same time.


First failure for FYI:
https://build.chromium.org/deprecated/chromium.webrtc.fyi/builders/Mac%20Tester/builds/53647

First failure for WebRTC Chromium bot:
https://build.chromium.org/deprecated/chromium.webrtc/builders/Mac%20Tester/builds/82482

The FYI bot fails only on WebRtcApprtcBrowserTest.MANUAL_WorksOnApprtc. Sample logs:
[ RUN      ] WebRtcApprtcBrowserTest.MANUAL_WorksOnApprtc
[11336:515:0821/233411.143817:3579233911747:FATAL:double_fork_and_exec.cc(107)] execvp /b/c/b/Mac_Tester/src/out/Release/Chromium.app/Contents/Versions/70.0.3531.0/Chromium Framework.framework/Helpers/crashpad_handler: No such file or directory (2)
0   browser_tests                       0x0000000109b9db0c base::debug::StackTrace::StackTrace(unsigned long) + 28
1   browser_tests                       0x0000000109af85c1 logging::LogMessage::~LogMessage() + 225
2   browser_tests                       0x0000000109af911c logging::ErrnoLogMessage::~ErrnoLogMessage() + 124
3   browser_tests                       0x000000010b489981 crashpad::DoubleForkAndExec(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, int, bool, void (*)()) + 785
4   browser_tests                       0x000000010b486f5f crashpad::(anonymous namespace)::HandlerStarter::CommonStart(base::FilePath const&, base::FilePath const&, base::FilePath const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > > const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, base::ScopedGeneric<unsigned int, base::mac::internal::ReceiveRightTraits>, crashpad::(anonymous namespace)::HandlerStarter*, bool) + 2415
5   browser_tests                       0x000000010b486123 crashpad::CrashpadClient::StartHandler(base::FilePath const&, base::FilePath const&, base::FilePath const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > > const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, bool, bool) + 563
6   browser_tests                       0x000000010cc286f8 crash_reporter::internal::PlatformCrashpadInitialization(bool, bool, bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, base::FilePath const&) + 856
7   browser_tests                       0x000000010cc28901 crash_reporter::InitializeCrashpad(bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 113
8   browser_tests                       0x000000010a99a527 ChromeMainDelegate::InitMacCrashReporter(base::CommandLine const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 135
9   browser_tests                       0x000000010a99a73e ChromeMainDelegate::PreSandboxStartup() + 174
10  browser_tests                       0x0000000109a7ac77 content::ContentMainRunnerImpl::Initialize(content::ContentMainParams const&) + 1111
11  browser_tests                       0x000000010bb4fd16 service_manager::Main(service_manager::MainParams const&) + 2566
12  browser_tests                       0x0000000109a7a484 content::ContentMain(content::ContentMainParams const&) + 68
13  browser_tests                       0x000000010a1797f9 content::BrowserTestBase::SetUp() + 2697
14  browser_tests                       0x0000000109bd33a3 InProcessBrowserTest::SetUp() + 723
15  browser_tests                       0x0000000107aa592d testing::Test::Run() + 109
16  browser_tests                       0x0000000107aa65c0 testing::TestInfo::Run() + 320
17  browser_tests                       0x0000000107aa6b37 testing::TestCase::Run() + 279
18  browser_tests                       0x0000000107ab2127 testing::internal::UnitTestImpl::RunAllTests() + 871
19  browser_tests                       0x0000000107ab1d9d testing::UnitTest::Run() + 109
20  browser_tests                       0x0000000109beda36 base::TestSuite::Run() + 166
21  browser_tests                       0x0000000109ad9a15 ChromeTestSuiteRunner::RunTestSuite(int, char**) + 37
22  browser_tests                       0x000000010a19c118 content::LaunchTests(content::TestLauncherDelegate*, unsigned long, int, char**) + 552
23  browser_tests                       0x0000000109ad9eed LaunchChromeTests(unsigned long, content::TestLauncherDelegate*, int, char**) + 333
24  browser_tests                       0x0000000109ad998e main + 94
25  libdyld.dylib                       0x00007fff6c090115 start + 1

The Chromium bot fails on many tests with uninformative stack traces.
Sample logs:
[ RUN      ] WebRtcAudioBrowserTest.EstablishAudioVideoCallAndEnsureAudioIsPlaying/1
Received signal 4 <unknown> 7fff88f43792
 [0x0001117bbb2c]
 [0x0001117bba21]
 [0x7fffa0e9fb3a]
 [0x7fca00608310]
 [0x7fff88f436c6]
 [0x7fff8901973f]
 [0x7fff8b1f0019]
 [0x7fffa03b1335]
 [0x7fff9f8a0d69]
 [0x7fff9f8a07de]
 [0x7fffa03af303]
 [0x7fff8b1efc55]
 [0x7fff88cbac28]
 [0x0001114ab7e9]
 [0x000110da6283]
 [0x000110dab7e1]
 [0x0001114aaf19]
 [0x00011146ad3f]
 [0x000110c9a352]
 [0x000112d70f71]
 [0x000110285264]
 [0x000111410709]
 [0x000111400778]
 [0x00010ffeff3d]
 [0x00010fff0bd0]
 [0x00010fff1147]
 [0x00010fffc737]
 [0x00010fffc3ad]
 [0x000111449a66]
 [0x0001114051ba]
 [0x00011142deb8]
 [0x000111405170]
 [0x7fffa0c90235]
 [0x00000000000a]
[end of stack trace]
 
Also note that nothing in the WebRTC and Chromium blamelists for the first failure looks suspicious to me.
Cc: guidou@chromium.org oprypin@chromium.org
Labels: OS-Mac
Owner: sweilun@chromium.org
This is indeed really weird, but those two waterfalls starting to fail at the same time sends a very strong signal that it's something about a commit in Chromium. It could sometimes also be an infra commit but I don't see anything there.

Here are those two builds again, for visibility
https://ci.chromium.org/buildbot/chromium.webrtc/Mac%20Tester/82482
https://ci.chromium.org/buildbot/chromium.webrtc.fyi/Mac%20Tester/53647

sweilun@, could you confirm or deny the possibility of your CL affecting browser_tests on Mac?
https://chromium-review.googlesource.com/c/chromium/src/+/1181684

In any case... could we get a speculative revert? Just so we don't go looking for a problem where there isn't one...

Then you can see if the next build succeeds on these two:
https://ci.chromium.org/buildbot/chromium.webrtc/Mac%20Tester/
https://ci.chromium.org/buildbot/chromium.webrtc.fyi/Mac%20Tester/
Cc: kmilka@chromium.org
Cc: mbonadei@chromium.org
Owner: guidou@chromium.org
Cc: ramyan@chromium.org
guidou@: https://chromium-review.googlesource.com/1181684 was reverted & relanded, so it looks like that was determined to not be the cause?

(I don't know much about WebRTC tests, but the changes in 1181684 seem unlikely to cause issues there).
ramyan@: That is correct. It was a speculative revert since it was the only CL in the blamelist, but the revert did not fix the bots and the CL was relanded.
Thanks for the confirmation guidou@!
I was able to reproduce the test failures locally with a ToT of today. Then I started to bisect to find where the problem started, but was never able to reproduce again, even with the version that had initially failed.

Maybe this was a temporary tool issue (perhaps some compiler bug) that got fixed at some point and the bots are still using the faulty one?

WRT WebRTC rolls into Chrome, perhaps we should start a manual roll, or disable the bots so the autoroller can resume automatic rolls.
Also note that, around the time the failures started, I didn't find any CLs related to the stack traces I was getting when I was able to reproduce.

I looked for a way to disable crashpad, but I couldn't find a convenient flag for it.

We have plenty of tests that run the whole chrome browser, so I wonder why the apprtc tests in particular are affected. One guess is that we invoke browser_tests twice on these bots in two different build steps, and the first invocation gets the crash reporter binary into a weird state.

Yes, I can see that the large tests bot fails in the same way: if we invoke browser_tests in different build steps on the same bot, all steps but the first fail.
I can't see any changes in recipes, breakpad or crashpad around Aug 22... https://chromium-review.googlesource.com/1181684 is the only one in the chromium blamelist, and it's not that one.

Also unexplained are the content browsertest crashes.
Cc: olka@chromium.org
Owner: ----
Owner: phoglund@chromium.org
Status: Assigned (was: Untriaged)
I can see that the failed build and the successful one differ by one //build CL for chromium.webrtc:
https://chromium.googlesource.com/chromium/tools/build/+/dc454099775dcec7ba4c31dd0fb0eb4c9447b9e2. For FYI, however, //build stays the same so it's most likely not that one.

The error is execvp /b/c/b/Mac_Tester/src/out/Release/Chromium.app/Contents/Versions/70.0.3531.0/Chromium Framework.framework/Helpers/crashpad_handler: No such file or directory (2)

I wonder if the path gets misconstructed or if the binary is accidentally deleted or something.
I think the code to put the handler there in the first place is here: https://cs.chromium.org/chromium/src/chrome/BUILD.gn?type=cs&q=%22crashpad_handler%22&sq=package:chromium&g=0&l=866

When I logged into the bot, I could see that everything in the path /b/c/b/Mac_Tester/src/out/Release/Chromium.app/Contents/Versions/70.0.X.X/Chromium Framework.framework/ is there, but not Helpers/crashpad_handler. Looks like the code that asserts is WAI.

For chromium.webrtc it seems the binary is gone when the tests start executing and for fyi it goes away after the first browser_tests invocation.
Aha, I did find the binary here in a slightly different path though:

/b/c/b/Mac_Tester/src/out/Release/Chromium.app/Contents/Versions/70.0.3535.0/Chromium\ Framework.framework/Versions/A/Helpers/crashpad_handler
All right, another theory: AppRTC tests and perf tests happen to be the only ones that aren't swarmed on that bot, so the swarming tasks obviously get the right build, but the tests that run on the bot itself don't. I'll look closer at the build that comes with the bot and compare it with what the swarmed shards are getting.
The full mac build has the same dir problem as described in #16 and #17, but the swarmed equivalent DOES have out/Release/Chromium.app/Contents/Versions/70.0.3531.0/Chromium\ Framework.framework/Helpers/crashpad_handler.

So this is a build packaging problem.
I can indeed see that the build https://ci.chromium.org/buildbot/chromium.webrtc/Mac%20Builder/101166 does have a functioning Helpers/breakpad. It turns out Helpers is actually a symlink into Versions/Current/Helpers/. 

I can see breakpad is the same between https://logs.chromium.org/v/?s=chromium%2Fbb%2Fchromium.webrtc%2FMac_Builder%2F101166%2F%2B%2Frecipes%2Fsteps%2Fpackage_build%2F0%2Fstdout (OK) and https://logs.chromium.org/v/?s=chromium%2Fbb%2Fchromium.webrtc%2FMac_Builder%2F101167%2F%2B%2Frecipes%2Fsteps%2Fpackage_build%2F0%2Fstdout (BAD), however "Chromium Framework.framework/Versions/Current" is added in the good version and not the bad one. New theory: the Versions/Current symlink has been broken somehow in packaged builds.
Yeah, the bad one has Versions/A and the good one has

Versions/
  A/
  Current -> A/
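
The good/bad comparison above was done by reading the package_build logs. If both packaged builds were available locally as zip files (an assumption; the file names in the usage note are hypothetical), a minimal sketch of the same comparison could use the standard zipfile module to list what the good archive has and the bad one lacks:

```python
import zipfile


def archive_entries(zip_path):
    """Return the set of entry names stored in a zip archive."""
    with zipfile.ZipFile(zip_path) as zf:
        return set(zf.namelist())


def missing_entries(good_zip, bad_zip):
    """Return entries present in the good archive but absent from the bad one."""
    return sorted(archive_entries(good_zip) - archive_entries(bad_zip))
```

For example, missing_entries("good_build.zip", "bad_build.zip") would be expected to include the "Chromium Framework.framework/Versions/Current" entry if the symlink was dropped during packaging.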
Cc: p...@chromium.org
Aha, this one is most likely the problem: https://chromium.googlesource.com/chromium/tools/build/+/dc454099775dcec7ba4c31dd0fb0eb4c9447b9e2

I'm going to speculatively revert it for now so we know if it's the culprit or not.
Project Member

Comment 24 by bugdroid1@chromium.org, Aug 27

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/9a41d66725951f98d7c3a425e418ca8155abf00e

commit 9a41d66725951f98d7c3a425e418ca8155abf00e
Author: Patrik Höglund <phoglund@chromium.org>
Date: Mon Aug 27 10:16:16 2018

Revert "zip_build: Apply the path filter recursively when creating a package archive."

This reverts commit dc454099775dcec7ba4c31dd0fb0eb4c9447b9e2.

Reason for revert: Speculative revert: looks like this breaks WebRTC build packaging for browser_tests.

Original change's description:
> zip_build: Apply the path filter recursively when creating a package archive.
>
> Previously we would only apply it to the direct descendants of the
> build directory and not the contents of any of its subdirectories. This
> meant that, for example, we ended up packaging the 'obj' and
> 'thinlto-cache' directories from the build directories for toolchains
> other than the default toolchain, despite these directories being
> filtered out.
>
> We previously had code that handled the 'initial' directory as a
> special case where the filter was also being applied to the files
> in that directory. It seems like this code is now dead because the
> directory is now named 'initialexe' (?), but in any event, since we
> now have a generalization of that code, I've removed it.
>
> Bug:  876316 
> Change-Id: Ieab27788bf3ca7c7bf970434ac491a053eaa5baf
> Reviewed-on: https://chromium-review.googlesource.com/1184302
> Reviewed-by: Nico Weber <thakis@chromium.org>
> Commit-Queue: Peter Collingbourne <pcc@chromium.org>

TBR=thakis@chromium.org,dpranke@chromium.org,pcc@chromium.org

# Not skipping CQ checks because original CL landed > 1 day ago.

Bug:  876316 ,  876743 
Change-Id: Ie229755a4836e6e80d896801535b3d5ad8a98650
Reviewed-on: https://chromium-review.googlesource.com/1189962
Commit-Queue: Patrik Höglund <phoglund@chromium.org>
Reviewed-by: Patrik Höglund <phoglund@chromium.org>

[modify] https://crrev.com/9a41d66725951f98d7c3a425e418ca8155abf00e/scripts/slave/zip_build.py
[modify] https://crrev.com/9a41d66725951f98d7c3a425e418ca8155abf00e/scripts/common/chromium_utils.py

And it greens up! Yay!

pcc: The problem here was that a symlink doesn't get zipped into the mac release build.

Before: out/Release/Chromium.app/Contents/Versions/70.0.X.X/Chromium Framework.framework/Helpers/ -> Chromium.app/Contents/Versions/70.0.X.X/Chromium Framework.framework/Versions/Current/Helpers

After your patch: 
out/Release/Chromium.app/Contents/Versions/70.0.X.X/Chromium Framework.framework/Helpers/ MISSING
Chromium.app/Contents/Versions/70.0.X.X/Chromium Framework.framework/Versions/Current MISSING

I think both the Helpers symlink itself and Versions/Current go missing, so the path out/Release/Chromium.app/Contents/Versions/70.0.X.X/Chromium Framework.framework/Helpers/crashpad_handler fails to resolve.

This obviously works on swarming. Unfortunately we have some tests that aren't converted to swarming. I plan to do that over the coming months but I can try to accelerate those plans if it's hard for you to fix+reland your CL.
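
The resolution failure described in this comment can be illustrated with a small sketch. It walks a path one component at a time and reports the first component that is either missing or a dangling symlink. The helper name first_unresolved_component is hypothetical and this is not code from the bots or from Chromium; it only uses os.path.lexists and os.path.exists from the standard library.

```python
import os


def first_unresolved_component(path):
    """Return (prefix, reason) for the first path prefix that fails to
    resolve, or None if the whole path resolves.

    Note: normpath collapses '..' lexically, which is fine for the plain
    paths discussed here but can differ from real resolution otherwise.
    """
    parts = os.path.normpath(path).split(os.sep)
    current = os.sep if os.path.isabs(path) else ""
    for part in parts:
        if not part:
            continue
        current = os.path.join(current, part)
        if not os.path.lexists(current):
            # Nothing at all with this name, not even a symlink.
            return current, "missing"
        if not os.path.exists(current):
            # A symlink exists but its target cannot be reached.
            return current, "dangling symlink"
    return None
```

Run against .../Chromium Framework.framework/Helpers/crashpad_handler on an affected build, this would be expected to stop at the Helpers component, matching the "No such file or directory" error from execvp.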
Status: Fixed (was: Assigned)
Project Member

Comment 27 by bugdroid1@chromium.org, Aug 28

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/d550bb44e12759d593d991a7a6f916af5d90b414

commit d550bb44e12759d593d991a7a6f916af5d90b414
Author: Peter Collingbourne <pcc@google.com>
Date: Tue Aug 28 16:47:23 2018

Reland "zip_build: Apply the path filter recursively when creating a package archive."

This relands commit dc454099775dcec7ba4c31dd0fb0eb4c9447b9e2.

The original commit introduced a bug where symlinks that point to
directories were omitted from the archive. The cause was that for
some reason os.walk returns symbolic links that point to directories
in dirs and not files, so we need to handle any symlinks in dirs as
if they appeared in files instead.

Original change's description:
> Revert "zip_build: Apply the path filter recursively when creating a package archive."
>
> This reverts commit dc454099775dcec7ba4c31dd0fb0eb4c9447b9e2.
>
> Reason for revert: Speculative revert: looks like this breaks WebRTC build packaging for browser_tests.
>
> Original change's description:
> > zip_build: Apply the path filter recursively when creating a package archive.
> >
> > Previously we would only apply it to the direct descendants of the
> > build directory and not the contents of any of its subdirectories. This
> > meant that, for example, we ended up packaging the 'obj' and
> > 'thinlto-cache' directories from the build directories for toolchains
> > other than the default toolchain, despite these directories being
> > filtered out.
> >
> > We previously had code that handled the 'initial' directory as a
> > special case where the filter was also being applied to the files
> > in that directory. It seems like this code is now dead because the
> > directory is now named 'initialexe' (?), but in any event, since we
> > now have a generalization of that code, I've removed it.
> >
> > Bug:  876316 
> > Change-Id: Ieab27788bf3ca7c7bf970434ac491a053eaa5baf
> > Reviewed-on: https://chromium-review.googlesource.com/1184302
> > Reviewed-by: Nico Weber <thakis@chromium.org>
> > Commit-Queue: Peter Collingbourne <pcc@chromium.org>
>
> TBR=thakis@chromium.org,dpranke@chromium.org,pcc@chromium.org
>
> # Not skipping CQ checks because original CL landed > 1 day ago.
>
> Bug:  876316 ,  876743 
> Change-Id: Ie229755a4836e6e80d896801535b3d5ad8a98650
> Reviewed-on: https://chromium-review.googlesource.com/1189962
> Commit-Queue: Patrik Höglund <phoglund@chromium.org>
> Reviewed-by: Patrik Höglund <phoglund@chromium.org>

Bug:  876316 ,  876743 
Change-Id: Ia6c329c797374e24d6bcf43d0cadd6517deb27b8
Reviewed-on: https://chromium-review.googlesource.com/1192234
Reviewed-by: Nico Weber <thakis@chromium.org>
Commit-Queue: Peter Collingbourne <pcc@chromium.org>

[modify] https://crrev.com/d550bb44e12759d593d991a7a6f916af5d90b414/scripts/slave/zip_build.py
[modify] https://crrev.com/d550bb44e12759d593d991a7a6f916af5d90b414/scripts/common/chromium_utils.py
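
The os.walk behaviour mentioned in the reland description can be seen in a small, self-contained sketch. This is not the actual zip_build.py change; list_archive_entries and build_demo_tree are hypothetical helpers, and the demo tree only mimics the framework layout discussed in this bug. With the default followlinks=False, os.walk reports symlinks that point to directories in dirnames and does not descend into them, so a walker that only collects filenames never sees them.

```python
import os
import tempfile


def list_archive_entries(root):
    """Return relative paths of regular files and directory symlinks under root."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Symlinks to directories show up in dirnames, not filenames.
        # Treat them as file-like entries so they are not silently dropped.
        links = [d for d in dirnames
                 if os.path.islink(os.path.join(dirpath, d))]
        for name in filenames + links:
            full = os.path.join(dirpath, name)
            entries.append(os.path.relpath(full, root))
    return sorted(entries)


def build_demo_tree(root):
    """Create a tiny stand-in for the framework layout (illustrative only)."""
    helpers = os.path.join(root, "Versions", "A", "Helpers")
    os.makedirs(helpers)
    with open(os.path.join(helpers, "crashpad_handler"), "w") as f:
        f.write("stub")
    os.symlink("A", os.path.join(root, "Versions", "Current"))
    os.symlink(os.path.join("Versions", "Current", "Helpers"),
               os.path.join(root, "Helpers"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_tree(tmp)
        for entry in list_archive_entries(tmp):
            print(entry)
```

Dropping the links list from the loop leaves only Versions/A/Helpers/crashpad_handler in the result, which is the shape of the bad package described above.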
