New issue
Advanced search Search tips

Issue 736783 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Jun 2017
EstimatedDays: ----
NextAction: ----
OS: Fuchsia
Pri: 3
Type: Bug

Blocking:
issue 706592



Sign in to add a comment

Failure in OpenFile() in test harness later causes crash in fflush() on Fuchsia

Project Member Reported by scottmg@chromium.org, Jun 26 2017

Issue description

Backtrace from bot failure on https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Fchromium.fyi%2FFuchsia%2F5513%2F%2B%2Frecipes%2Fsteps%2Fbase_unittests%2F0%2Fstdout

which is here:

https://cs.chromium.org/chromium/src/base/test/gtest_xml_unittest_result_printer.cc?rcl=fb7942c7ff5bb9b0282483a5159c86537a2cfeb9&l=150

The run script doesn't know how to symbolize libc yet, but looks like it's calling fflush().

output_file

[00098.987] 01123.01174> <== fatal exception: process [59308] thread [59403]
[00098.987] 01123.01174> <== fatal page fault, PC at 0x436ab71c7218
[00098.989] 01123.01174>  CS:                   0 RIP:     0x436ab71c7218 EFL:              0x246 CR2:               0x8c
[00098.989] 01123.01174>  RAX:                  0 RBX:     0x123ecc17fa40 RCX:               0x43 RDX:               0x32
[00098.989] 01123.01174>  RSI:                0x9 RDI:     0x436ab723ed40 RBP:     0x2ded18a9af18 RSP:     0x2ded18a9af00
[00098.989] 01123.01174>   R8:                  0  R9:                  0 R10:               0x20 R11:                  0
[00098.989] 01123.01174>  R12:     0x2ded18a9afb0 R13:     0x2ded18a9af30 R14:     0x32864f18e3c0 R15:                  0
[00098.989] 01123.01174>  errc:               0x4
[00098.989] 01123.01174> bottom of user stack:
[00098.990] 01123.01174> 0x00002ded18a9af00: 00000000 00000000 00000000 00000000 |................|
[00098.990] 01123.01174> 0x00002ded18a9af10: 00000000 00000000 00000010 00000030 |............0...|
[00098.990] 01123.01174> 0x00002ded18a9af20: 18a9b170 00002ded 18a9b0b0 00002ded |p....-.......-..|
[00098.990] 01123.01174> 0x00002ded18a9af30: 00000000 00000000 00000000 00000000 |................|
[00098.990] 01123.01174> 0x00002ded18a9af40: 00000000 00000000 00000000 00000000 |................|
[00098.990] 01123.01174> 0x00002ded18a9af50: 00000000 00000000 18a9b1f4 00002ded |.............-..|
[00098.990] 01123.01174> 0x00002ded18a9af60: 00000014 00000000 b71c9ca7 0000436a |............jC..|
[00098.990] 01123.01174> 0x00002ded18a9af70: 18a9b7f0 00002ded b71f77ae 0000436a |.....-...w..jC..|
[00098.990] 01123.01174> 0x00002ded18a9af80: 00000000 00000000 00000000 00000000 |................|
[00098.990] 01123.01174> 0x00002ded18a9af90: 00000000 00000000 00000000 00000000 |................|
[00098.990] 01123.01174> 0x00002ded18a9afa0: 18a9b1f4 00002ded 18a9b1ee 00002ded |.....-.......-..|
[00098.991] 01123.01174> 0x00002ded18a9afb0: 00000000 00000000 18a9b1f4 00002ded |.............-..|
[00098.991] 01123.01174> 0x00002ded18a9afc0: 00000000 00000000 b71c9be0 0000436a |............jC..|
[00098.991] 01123.01174> 0x00002ded18a9afd0: 00000000 00000000 18a9b1e0 00002ded |.............-..|
[00098.991] 01123.01174> 0x00002ded18a9afe0: 00000014 00000000 00000000 00000000 |................|
[00098.991] 01123.01174> 0x00002ded18a9aff0: 00000000 00000000 00000000 00000000 |................|
[00098.991] 01123.01174> arch: x86_64
[00099.014] 01123.01174> dso: id=05a640477f586007 base=0x7814e597f000 name=libc++abi.so.1
[00099.014] 01123.01174> dso: id=2b6882f235d28d75ef453da68f39774ef99c8388 base=0x744cc6132000 name=<vDSO>
[00099.014] 01123.01174> dso: id=aeae5a3cc3b99502 base=0x605a2ef59000 name=libunwind.so.1
[00099.014] 01123.01174> dso: id=7232e5e96679f46409d5b78b2b94bccadd2de16a base=0x436ab7184000 name=libc.so
[00099.014] 01123.01174> dso: id=6682d527b69ec892f730643f7ff711e8f78f4116 base=0x3345506de000 name=liblaunchpad.so
[00099.014] 01123.01174> dso: id=765f3f6ec08e043f base=0x2e47e654e000 name=libc++.so.2
[00099.014] 01123.01174> dso: id=846fc8bf1dc97c41 base=0x123ecc117000 name=app:/system/base_unittests
[00099.014] 01123.01174> dso: id=0ee2c1798ac9282426341de1bfb7f8d509010a73 base=0x115970a6a000 name=libmxio.so
[00099.029] 01123.01174> bt#01: pc 0x436ab71c7218 sp 0x2ded18a9af00 (libc.so,0x43218)
[00099.041] 01123.01174> bt#02: pc 0x436ab71c2607 sp 0x2ded18a9b090 (libc.so,0x3e607)
[00099.041] 01123.01174> bt#03: pc 0x123eccc4aa80 sp 0x2ded18a9b170 (app:/system/base_unittests,0xb33a80)
[00099.045] 01123.01174> bt#04: pc 0x123eccc4a043 sp 0x2ded18a9b210 (app:/system/base_unittests,0xb33043)
[00099.046] 01123.01174> bt#05: pc 0x123eccc4a82e sp 0x2ded18a9b2b0 (app:/system/base_unittests,0xb3382e)
[00099.046] 01123.01174> bt#06: pc 0x123eccbaf300 sp 0x2ded18a9b2d0 (app:/system/base_unittests,0xa98300)
[00099.047] 01123.01174> bt#07: pc 0x123eccc49e33 sp 0x2ded18a9b770 (app:/system/base_unittests,0xb32e33)
[00099.048] 01123.01174> bt#08: pc 0x123eccc4a3a9 sp 0x2ded18a9b8f0 (app:/system/base_unittests,0xb333a9)
[00099.049] 01123.01174> bt#09: pc 0x123eccc49a3f sp 0x2ded18a9ba90 (app:/system/base_unittests,0xb32a3f)
[00099.070] 01123.01174> bt#10: pc 0x123eccc537dc sp 0x2ded18a9bad0 (app:/system/base_unittests,0xb3c7dc)
[00099.072] 01123.01174> bt#11: pc 0x123eccc5369a sp 0x2ded18a9bf20 (app:/system/base_unittests,0xb3c69a)
[00099.073] 01123.01174> bt#12: pc 0x123eccc3e274 sp 0x2ded18a9bf60 (app:/system/base_unittests,0xb27274)
[00099.073] 01123.01174> bt#13: pc 0x436ab719ffed sp 0x2ded18a9bff0 (libc.so,0x1bfed)
[00099.075] 01123.01174> bt#14: pc 0 sp 0x2ded18a9c000
[00099.076] 01123.01174> bt#15: end
----- start symbolized stack
#01: ?? ??:0
#02: ?? ??:0
#03: base::XmlUnitTestResultPrinter::WriteTestPartResult(char const*, int, testing::TestPartResult::Type, std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char> > const&, std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char> > const&) at /b/c/builder/linux/src/out/Release/../../base/test/gtest_xml_unittest_result_printer.cc:150
#04: std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char> >::__is_long() const at /b/c/builder/linux/src/out/Release/../../third_party/fuchsia-sdk/sysroot/x86_64-fuchsia/include/c++/v1/string:1226
      (inlined by) ~basic_string at /b/c/builder/linux/src/out/Release/../../third_party/fuchsia-sdk/sysroot/x86_64-fuchsia/include/c++/v1/string:1911
      (inlined by) base::TestSuite::UnitTestAssertHandler(char const*, int, base::BasicStringPiece<std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char> > >, base::BasicStringPiece<std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char> > >) at /b/c/builder/linux/src/out/Release/../../base/test/test_suite.cc:311
#05: base::internal::Invoker<base::internal::BindState<void (base::TestSuite::*)(char const*, int, base::BasicStringPiece<std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char> > >, base::BasicStringPiece<std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char> > >), base::internal::UnretainedWrapper<base::TestSuite> >, void (char const*, int, base::BasicStringPiece<std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char> > >, base::BasicStringPiece<std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char> > >)>::Run(base::internal::BindStateBase*, char const*&&, int&&, base::BasicStringPiece<std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char> > >&&, base::BasicStringPiece<std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char> > >&&) at /b/c/builder/linux/src/out/Release/../../base/bind_internal.h:333
#06: ~LogMessage at /b/c/builder/linux/src/out/Release/../../base/callback.h:?
#07: base::TestSuite::AddTestLauncherResultPrinter() at /b/c/builder/linux/src/out/Release/../../base/test/test_suite.cc:244
#08: base::TestSuite::Initialize() at /b/c/builder/linux/src/out/Release/../../base/test/test_suite.cc:406
#09: base::TestSuite::Run() at /b/c/builder/linux/src/out/Release/../../base/test/test_suite.cc:261
#10: base::Callback<int (), (base::internal::CopyMode)1, (base::internal::RepeatMode)1>::Run() const & at /b/c/builder/linux/src/out/Release/../../base/callback.h:80
      (inlined by) base::(anonymous namespace)::LaunchUnitTestsInternal(base::Callback<int (), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&, unsigned long, int, bool, base::Callback<void (), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&) at /b/c/builder/linux/src/out/Release/../../base/test/launcher/unit_test_launcher.cc:216
#11: base::LaunchUnitTests(int, char**, base::Callback<int (), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&) at /b/c/builder/linux/src/out/Release/../../base/test/launcher/unit_test_launcher.cc:462
#12: main at /b/c/builder/linux/src/out/Release/../../base/test/run_all_base_unittests.cc:22
#13: ?? ??:0
----- end symbolized stack
 
output_file_ doesn't obviously look like it can be null or closed, unless maybe that XmlUnitTestResultPrinter object has already been destroyed.]

...

Ah, based on #6/#7 in the stack trace it could be that https://cs.chromium.org/chromium/src/base/test/gtest_xml_unittest_result_printer.cc?rcl=fb7942c7ff5bb9b0282483a5159c86537a2cfeb9&l=34 is failing the OpenFile, which causes a CHECK here https://cs.chromium.org/chromium/src/base/test/test_suite.cc?rcl=44851965d0862ce66a1cad576686125ef8864d25&l=242 which causes a Log, which then tries to output to the file that just failed to open, causing a crash.
Introducing a fake failure at that OpenFile() does cause the same crash. But I don't know why the OpenFile would be failing.
Summary: Failure in OpenFile() in test harness later causes crash in fflush() on Fuchsia (was: Crash in fflush() in test harness on Fuchsia)
I poked around in the output file/temp file creation, and it seems like that is this crash, but I can't find any reason why the file open would be failing or racing.

mkdtemp looks like it uses the realtime clock for seeding __randname, and tries 100 times before just returning XXXXXX of an unfilled template. It could be possible qemu's clock is very quantized in some cases or something, causing mkdtemp to not return a unique dir. But even then, it looks like that would log an error in CreateTemporaryDirInDirImpl() rather than failing later. So I don't think that's it.

In the parent process, the directory for the file is created, and that's CHECKd (and doesn't seem to fail).

So that leaves us with a failure to create a plain file (test_results.xml) by the child in a directory the parent successfully created. So that's kinda weird.
I got this locally with an alias of the filename onto the stack (it's hard to log anything from the child process at this point)

"/tmp/.org.chromium.Chromium.AgIFGF/test_results.xml" seems pretty as-expected.

[00003.494] 01123.01174> <== fatal exception: process [60190] thread [60221]
[00003.494] 01123.01174> <== fatal page fault, PC at 0x6cdddf31a5e0
[00003.494] 01123.01174>  CS:                   0 RIP:     0x6cdddf31a5e0 EFL:            0x10246 CR2:               0x20
[00003.494] 01123.01174>  RAX:     0x34697ed52570 RBX:     0x24d09a00c780 RCX:     0x34697ed525a0 RDX:                  0
[00003.494] 01123.01174>  RSI:     0x24d09a00c7b3 RDI:     0x34697ed52570 RBP:     0x4a7ba8d38840 RSP:     0x34697ed52570
[00003.494] 01123.01174>   R8: 0x8080808080808080  R9: 0xfefefefefefefeff R10:  0x10001000f000300 R11:     0x24d09a00c7b3
[00003.494] 01123.01174>  R12:     0x34697ed52857 R13:     0x34697ed52840 R14:     0x34697ed52570 R15:     0x3c85f9d5cbe0
[00003.494] 01123.01174>  errc:               0x6
[00003.494] 01123.01174> bottom of user stack:
[00003.494] 01123.01174> 0x000034697ed52570: 706d742f 726f2e2f 68632e67 696d6f72 |/tmp/.org.chromi|
[00003.494] 01123.01174> 0x000034697ed52580: 432e6d75 6d6f7268 2e6d7569 46496741 |um.Chromium.AgIF|
[00003.494] 01123.01174> 0x000034697ed52590: 742f4647 5f747365 75736572 2e73746c |GF/test_results.|
[00003.494] 01123.01174> 0x000034697ed525a0: 006c6d78 00003469 df308d0d 00006cdd |xml.i4....0..l..|
[00003.494] 01123.01174> 0x000034697ed525b0: 9a00c780 000024d0 7ed52610 00003469 |.....$...&.~i4..|
[00003.494] 01123.01174> 0x000034697ed525c0: 7ed52857 00003469 7ed52840 00003469 |W(.~i4..@(.~i4..|
[00003.494] 01123.01174> 0x000034697ed525d0: 7ed52f70 00003469 d7c34aae 00000a13 |p/.~i4...J......|
[00003.494] 01123.01174> 0x000034697ed525e0: f9d5cbe0 00003c85 424cbd03 000053e1 |.....<....LB.S..|
[00003.494] 01123.01174> 0x000034697ed525f0: 9a00c700 000024d0 9a00c780 000024d0 |.....$.......$..|
[00003.494] 01123.01174> 0x000034697ed52600: a8d38840 00004a7b d7c3632f 00000a13 |@...{J../c......|
[00003.494] 01123.01174> 0x000034697ed52610: 7ed526d8 00003469 9a00c740 000024d0 |.&.~i4..@....$..|
[00003.494] 01123.01174> 0x000034697ed52620: 7ed526d8 00003469 7ed526d8 00003469 |.&.~i4...&.~i4..|
[00003.494] 01123.01174> 0x000034697ed52630: 7ed52857 00003469 7ed52840 00003469 |W(.~i4..@(.~i4..|
[00003.494] 01123.01174> 0x000034697ed52640: 7ed526d8 00003469 98a037ed 00007841 |.&.~i4...7..Ax..|
[00003.494] 01123.01174> 0x000034697ed52650: 7ed526d8 00003469 9a00c740 000024d0 |.&.~i4..@....$..|
[00003.494] 01123.01174> 0x000034697ed52660: a8d38840 00004a7b 00000010 00000000 |@...{J..........|
[00003.494] 01123.01174> arch: x86_64
[00003.495] 01123.01174> dso: id=765f3f6ec08e043f base=0x78419894f000 name=libc++.so.2
[00003.495] 01123.01174> dso: id=05a640477f586007 base=0x6df7b4675000 name=libc++abi.so.1
[00003.495] 01123.01174> dso: id=ee3d7ae5d9f4f31d base=0x6cddde6e8000 name=app:/system/base_unittests
[00003.495] 01123.01174> dso: id=2b6882f235d28d75ef453da68f39774ef99c8388 base=0x60a5a29b4000 name=<vDSO>
[00003.495] 01123.01174> dso: id=6682d527b69ec892f730643f7ff711e8f78f4116 base=0x605325895000 name=liblaunchpad.so
[00003.495] 01123.01174> dso: id=7232e5e96679f46409d5b78b2b94bccadd2de16a base=0x53e142455000 name=libc.so
[00003.495] 01123.01174> dso: id=aeae5a3cc3b99502 base=0x3e965d972000 name=libunwind.so.1
[00003.495] 01123.01174> dso: id=0ee2c1798ac9282426341de1bfb7f8d509010a73 base=0xa13d7c26000 name=libmxio.so
[00003.495] 01123.01174> bt#01: pc 0x6cdddf31a5e0 sp 0x34697ed52570 (app:/system/base_unittests,0xc325e0)
[00003.495] 01123.01174> bt#02: pc 0x6cdddf319a07 sp 0x34697ed526c0 (app:/system/base_unittests,0xc31a07)
[00003.495] 01123.01174> bt#03: pc 0x6cdddf319fa9 sp 0x34697ed52840 (app:/system/base_unittests,0xc31fa9)
[00003.495] 01123.01174> bt#04: pc 0x6cdddf31963f sp 0x34697ed529e0 (app:/system/base_unittests,0xc3163f)
[00003.495] 01123.01174> bt#05: pc 0x6cdddf325442 sp 0x34697ed52a20 (app:/system/base_unittests,0xc3d442)
[00003.495] 01123.01174> bt#06: pc 0x6cdddf3252aa sp 0x34697ed52f20 (app:/system/base_unittests,0xc3d2aa)
[00003.495] 01123.01174> bt#07: pc 0x6cdddf30bc34 sp 0x34697ed52f60 (app:/system/base_unittests,0xc23c34)
[00003.495] 01123.01174> bt#08: pc 0x53e142470fed sp 0x34697ed52ff0 (libc.so,0x1bfed)
[00003.495] 01123.01174> bt#09: pc 0 sp 0x34697ed53000
[00003.495] 01123.01174> bt#10: end
----- start symbolized stack
#01: base::XmlUnitTestResultPrinter::Initialize(base::FilePath const&) at /work/cr/src/out/fuch/../../base/test/gtest_xml_unittest_result_printer.cc:43
#02: base::TestSuite::AddTestLauncherResultPrinter() at /work/cr/src/out/fuch/../../base/test/test_suite.cc:242
#03: base::TestSuite::Initialize() at /work/cr/src/out/fuch/../../base/test/test_suite.cc:406
#04: base::TestSuite::Run() at /work/cr/src/out/fuch/../../base/test/test_suite.cc:261
#05: base::Callback<int (), (base::internal::CopyMode)1, (base::internal::RepeatMode)1>::Run() const & at /work/cr/src/out/fuch/../../base/callback.h:80
      (inlined by) base::(anonymous namespace)::LaunchUnitTestsInternal(base::Callback<int (), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&, unsigned long, int, bool, base::Callback<void (), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&) at /work/cr/src/out/fuch/../../base/test/launcher/unit_test_launcher.cc:216
#06: base::LaunchUnitTests(int, char**, base::Callback<int (), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&) at /work/cr/src/out/fuch/../../base/test/launcher/unit_test_launcher.cc:462
#07: main at /work/cr/src/out/fuch/../../base/test/run_all_base_unittests.cc:22
#08: ?? ??:0
----- end symbolized stack
It seems that the parent directory doesn't exist ("dir:false" aliased to stack), which is presumably why the open file fails.

Some code to check at that point: https://chromium-review.googlesource.com/c/549178/

So, dir:false means DirectoryExists() returns false, stat() returns non-zero, and errno is 2, which seems pretty conclusive.

However, the log+CHECK in the parent process in base/test/launcher/unit_test_launcher.cc show the directory creation succeeding. So that's a thing.

Maybe fs actions aren't guaranteed to be visible between parent/child?
 

[00003.545] 01123.01174> <== fatal exception: process [60069] thread [60100]
[00003.545] 01123.01174> <== fatal page fault, PC at 0x600172fc06a6
[00003.545] 01123.01174>  CS:                   0 RIP:     0x600172fc06a6 EFL:            0x10202 CR2:               0x20
[00003.545] 01123.01174>  RAX:     0x65bb6a7334e0 RBX:                  0 RCX: 0xff2b312b796d2b64 RDX:                  0
[00003.545] 01123.01174>  RSI:      0xdd080ffb7b3 RDI:     0x65bb6a7334e0 RBP:     0x65bb6a7334c0 RSP:     0x65bb6a7334c0
[00003.545] 01123.01174>   R8: 0xfefefefefefefeff  R9:                  0 R10:                0x2 R11:                  0
[00003.545] 01123.01174>  R12:     0x65bb6a733857 R13:     0x65bb6a733840 R14:     0x65bb6a7334e0 R15:      0xdd080ffb780
[00003.545] 01123.01174>  errc:               0x6
[00003.545] 01123.01174> bottom of user stack:
[00003.545] 01123.01174> 0x000065bb6a7334c0: 80002c32 00000dd0 00000022 00000000 |2,......".......|
[00003.545] 01123.01174> 0x000065bb6a7334d0: 00000040 80000000 af78a7ae 000056a9 |@.........x..V..|
[00003.545] 01123.01174> 0x000065bb6a7334e0: 3a726964 736c6166 7a6e2c65 2f2c322c |dir:false,nz,2,/|
[00003.545] 01123.01174> 0x000065bb6a7334f0: 2f706d74 67726f2e 7268632e 75696d6f |tmp/.org.chromiu|
[00003.545] 01123.01174> 0x000065bb6a733500: 68432e6d 696d6f72 6e2e6d75 4561436e |m.Chromium.nnCaE|
[00003.545] 01123.01174> 0x000065bb6a733510: 65742f44 725f7473 6c757365 782e7374 |D/test_results.x|
[00003.545] 01123.01174> 0x000065bb6a733520: 6a006c6d 000065bb 00000014 00000000 |ml.j.e..........|
[00003.545] 01123.01174> 0x000065bb6a733530: 74736574 75616c2d 6568636e 756f2d72 |test-launcher-ou|
[00003.545] 01123.01174> 0x000065bb6a733540: 74757074 14000000 b72743a9 6292701b |tput.....C'..p.b|
[00003.545] 01123.01174> 0x000065bb6a733550: 6a7336d8 000065bb 6a733f70 000065bb |.6sj.e..p?sj.e..|
[00003.545] 01123.01174> 0x000065bb6a733560: 42fd1be0 00005c42 80ffb740 00000dd0 |...BB\..@.......|
[00003.545] 01123.01174> 0x000065bb6a733570: 00000022 00000000 00000040 80000000 |".......@.......|
[00003.545] 01123.01174> 0x000065bb6a733580: 6a733601 000065bb af78a7ae 000056a9 |.6sj.e....x..V..|
[00003.545] 01123.01174> 0x000065bb6a733590: 6a73359d 000065bb 00000028 5f6e6500 |.5sj.e..(....en_|
[00003.545] 01123.01174> 0x000065bb6a7335a0: 6a005355 000065bb 72faed0d 00006001 |US.j.e.....r.`..|
[00003.545] 01123.01174> 0x000065bb6a7335b0: 80ffb780 00000dd0 6a733610 000065bb |.........6sj.e..|
[00003.545] 01123.01174> arch: x86_64
[00003.545] 01123.01174> dso: id=0ee2c1798ac9282426341de1bfb7f8d509010a73 base=0x793ee0ae9000 name=libmxio.so
[00003.545] 01123.01174> dso: id=2b6882f235d28d75ef453da68f39774ef99c8388 base=0x6df808dd1000 name=<vDSO>
[00003.545] 01123.01174> dso: id=9145014ade6fe486 base=0x60017238e000 name=app:/system/base_unittests
[00003.545] 01123.01174> dso: id=7232e5e96679f46409d5b78b2b94bccadd2de16a base=0x56a9af717000 name=libc.so
[00003.545] 01123.01174> dso: id=6682d527b69ec892f730643f7ff711e8f78f4116 base=0x50a483858000 name=liblaunchpad.so
[00003.545] 01123.01174> dso: id=05a640477f586007 base=0x43e26306f000 name=libc++abi.so.1
[00003.545] 01123.01174> dso: id=765f3f6ec08e043f base=0x2e291eab8000 name=libc++.so.2
[00003.545] 01123.01174> dso: id=aeae5a3cc3b99502 base=0x7b02d780000 name=libunwind.so.1
[00003.547] 01123.01174> bt#01: pc 0x600172fc06a6 sp 0x65bb6a7334c0 (app:/system/base_unittests,0xc326a6)
[00003.548] 01123.01174> bt#02: pc 0x600172fbfa07 sp 0x65bb6a7336c0 (app:/system/base_unittests,0xc31a07)
[00003.548] 01123.01174> bt#03: pc 0x600172fbffa9 sp 0x65bb6a733840 (app:/system/base_unittests,0xc31fa9)
[00003.548] 01123.01174> bt#04: pc 0x600172fbf63f sp 0x65bb6a7339e0 (app:/system/base_unittests,0xc3163f)
[00003.548] 01123.01174> bt#05: pc 0x600172fcb502 sp 0x65bb6a733a20 (app:/system/base_unittests,0xc3d502)
[00003.548] 01123.01174> bt#06: pc 0x600172fcb36a sp 0x65bb6a733f20 (app:/system/base_unittests,0xc3d36a)
[00003.548] 01123.01174> bt#07: pc 0x600172fb1c34 sp 0x65bb6a733f60 (app:/system/base_unittests,0xc23c34)
[00003.548] 01123.01174> bt#08: pc 0x56a9af732fed sp 0x65bb6a733ff0 (libc.so,0x1bfed)
[00003.548] 01123.01174> bt#09: pc 0 sp 0x65bb6a734000
[00003.548] 01123.01174> bt#10: end
----- start symbolized stack
#01: base::XmlUnitTestResultPrinter::Initialize(base::FilePath const&) at /work/cr/src/out/fuch/../../base/test/gtest_xml_unittest_result_printer.cc:58
#02: base::TestSuite::AddTestLauncherResultPrinter() at /work/cr/src/out/fuch/../../base/test/test_suite.cc:242
#03: base::TestSuite::Initialize() at /work/cr/src/out/fuch/../../base/test/test_suite.cc:406
#04: base::TestSuite::Run() at /work/cr/src/out/fuch/../../base/test/test_suite.cc:261
#05: base::Callback<int (), (base::internal::CopyMode)1, (base::internal::RepeatMode)1>::Run() const & at /work/cr/src/out/fuch/../../base/callback.h:80
      (inlined by) base::(anonymous namespace)::LaunchUnitTestsInternal(base::Callback<int (), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&, unsigned long, int, bool, base::Callback<void (), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&) at /work/cr/src/out/fuch/../../base/test/launcher/unit_test_launcher.cc:216
#06: base::LaunchUnitTests(int, char**, base::Callback<int (), (base::internal::CopyMode)1, (base::internal::RepeatMode)1> const&) at /work/cr/src/out/fuch/../../base/test/launcher/unit_test_launcher.cc:462
#07: main at /work/cr/src/out/fuch/../../base/test/run_all_base_unittests.cc:22
#08: ?? ??:0
----- end symbolized stack


Owner: scottmg@chromium.org
Status: Started (was: Untriaged)
Alright! I finally figured this one out fully. There's a combination of problems.

There's some racy behaviour in processes that create child processes. In, for example https://cs.chromium.org/chromium/src/base/process/process_unittest.cc?l=52 there's no wait for the child process to terminate (but this happens in a bunch of tests). Thsoe are invoking another copy of the test binary, so it goes through all the usual setup and attempt to start logging to a file. However, the main runner is the one that creates the temporary output directory for logging, and removes it when the test is "done".

So, sometimes that un-waited-on child can not even start up before the directory is deleted by the main process. (This seems to happen more freqently on Fuchsia).

When that happens, base::XmlUnitTestResultPrinter::Initialize() fails to be able to create its output directory.

And when that happens, base::TestSuite::AddTestLauncherResultPrinter CHECKs. Somewhat circularly, that causes a log message, which attempts to use the not-correctly-initialized XmlUnitTestResultPrinter, which now has a null FILE* output_file_.

On (all?) other platforms the libc is tolerant of FILE* being NULL when passed to fprintf/fflush, etc. However, on Fuchsia, it (fairly reasonably) crashes.

Whew!
(oh, and to repro, it happened on Fuchsia mostly only when the spawn-a-child test happened to be the last test in the batch of 10, which made it extra tricky to figure out when it was happening)
Project Member

Comment 8 by bugdroid1@chromium.org, Jun 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/66411becd2b5a570b5629b64422a17a3583b2d0e

commit 66411becd2b5a570b5629b64422a17a3583b2d0e
Author: Scott Graham <scottmg@chromium.org>
Date: Tue Jun 27 19:21:27 2017

Make XmlUnitTestResultPrinter handle failing to open output

This was causing occasional crashes on Fuchsia. See
https://bugs.chromium.org/p/chromium/issues/detail?id=736783#c6 for a
detailed explanation of the conditions under which this can happen.

Bug:  736783 , 706592
Change-Id: I5000509d846dd9b8f10b9ae7a58bbfbdd642cc96
Reviewed-on: https://chromium-review.googlesource.com/550200
Reviewed-by: Paweł Hajdan Jr. <phajdan.jr@chromium.org>
Commit-Queue: Scott Graham <scottmg@chromium.org>
Cr-Commit-Position: refs/heads/master@{#482709}
[modify] https://crrev.com/66411becd2b5a570b5629b64422a17a3583b2d0e/base/test/gtest_xml_unittest_result_printer.cc
[modify] https://crrev.com/66411becd2b5a570b5629b64422a17a3583b2d0e/base/test/gtest_xml_unittest_result_printer.h

Status: Fixed (was: Started)

Sign in to add a comment