NestedMessagePumpAndroid's Quit() logic is racy [Flaking RenderProcessHostTest.FetchKeepAliveRendererProcess_Hung]
Issue description

OS: Android
Test Bot: Marshmallow 64 bit Tester
Test Suite: viz_content_browsertests
Test: RenderProcessHostTest.FetchKeepAliveRendererProcess_Hung
Example Failing Run: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Marshmallow%2064%20bit%20Tester/23130

Stack Trace:
[FATAL:lock_impl_posix.cc(80)] Check failed: rv == 0 (16 vs. 0). Device or resource busy.
[ERROR:test_suite.cc(303)] Currently running: RenderProcessHostTest.FetchKeepAliveRendererProcess_Hung
Searching for native crashes in: /b/swarming/w/itVbcpYJ/tmp18mzxK
Unknown Android release, consider passing --packed-lib.
Reading Android symbols from: /b/swarming/w/ir
Searching for Chrome symbols from within: /b/swarming/w/ir/out/Debug/lib.unstripped:/b/swarming/w/ir/out/Debug
Stack Trace:
RELADDR FUNCTION FILE:LINE
0000000004edf63f logging::LogMessage::~LogMessage() ../../base/logging.cc:599:29
0000000004f6ad37 base::internal::LockImpl::~LockImpl() ../../base/synchronization/lock_impl_posix.cc:80:3
0000000004f16787 base::Lock::~Lock() ../../base/synchronization/lock.cc:20:1
0000000004f6b9b7 base::WaitableEvent::WaitableEventKernel::~WaitableEventKernel() ../../base/synchronization/waitable_event_posix.cc:377:58
0000000004f6ba9b void base::RefCountedThreadSafe<base::WaitableEvent::WaitableEventKernel, base::DefaultRefCountedThreadSafeTraits<base::WaitableEvent::WaitableEventKernel> >::DeleteInternal<base::WaitableEvent::WaitableEventKernel>(base::WaitableEvent::WaitableEventKernel const*) ../../base/memory/ref_counted.h:400:5
v------> content::NestedMessagePumpAndroid::RunState::~RunState() ../../content/public/test/nested_message_pump_android.cc:26:34
0000000004bf9f3f content::NestedMessagePumpAndroid::Run(base::MessagePump::Delegate*) ../../content/public/test/nested_message_pump_android.cc:118:0
v------> std::__ndk1::unique_ptr<base::MessagePump, std::__ndk1::default_delete<base::MessagePump> >::operator->() const ../../third_party/android_ndk/sources/cxx-stl/llvm-libc++/include/memory:2515:19
0000000004ee5c73 base::MessageLoop::Run(bool) ../../base/message_loop/message_loop.cc:383:0
0000000004f04443 base::RunLoop::Run() ../../base/run_loop.cc:102:14
0000000004f04807 base::RunLoop::RunUntilIdle() ../../base/run_loop.cc:115:3
0000000002b5e4c3 content::(anonymous namespace)::RenderProcessHostTest::WaitUntilProcessExits(int) ../../content/browser/renderer_host/render_process_host_browsertest.cc:91:23
0000000002b5e983 content::(anonymous namespace)::RenderProcessHostTest_FetchKeepAliveRendererProcess_Hung_Test::RunTestOnMainThread() ../../content/browser/renderer_host/render_process_host_browsertest.cc:1038:3
0000000004c0bb57 content::BrowserTestBase::ProxyRunTestOnMainThreadLoop() ../../content/public/test/browser_test_base.cc:406:5
00000000029d3ff7 void base::internal::Invoker<base::internal::BindState<void (content::GenerateMHTMLAndExitRendererMessageFilter::*)(), base::internal::UnretainedWrapper<content::GenerateMHTMLAndExitRendererMessageFilter> >, void ()>::RunImpl<void (content::GenerateMHTMLAndExitRendererMessageFilter::*)(), std::__ndk1::tuple<base::internal::UnretainedWrapper<content::GenerateMHTMLAndExitRendererMessageFilter> >, 0ul>(void (content::GenerateMHTMLAndExitRendererMessageFilter::*&&)(), std::__ndk1::tuple<base::internal::UnretainedWrapper<content::GenerateMHTMLAndExitRendererMessageFilter> >&&, std::__ndk1::integer_sequence<unsigned long, 0ul>) ../../base/bind_internal.h:689:12
0000000004cc2d63 content::ShellBrowserMainParts::PreMainMessageLoopRun() ../../content/shell/browser/shell_browser_main_parts.cc:199:26
00000000036d1d3f content::BrowserMainLoop::PreMainMessageLoopRun() ../../content/browser/browser_main_loop.cc:1017:13
00000000029d3ff7 void base::internal::Invoker<base::internal::BindState<void (content::GenerateMHTMLAndExitRendererMessageFilter::*)(), base::internal::UnretainedWrapper<content::GenerateMHTMLAndExitRendererMessageFilter> >, void ()>::RunImpl<void (content::GenerateMHTMLAndExitRendererMessageFilter::*)(), std::__ndk1::tuple<base::internal::UnretainedWrapper<content::GenerateMHTMLAndExitRendererMessageFilter> >, 0ul>(void (content::GenerateMHTMLAndExitRendererMessageFilter::*&&)(), std::__ndk1::tuple<base::internal::UnretainedWrapper<content::GenerateMHTMLAndExitRendererMessageFilter> >&&, std::__ndk1::integer_sequence<unsigned long, 0ul>) ../../base/bind_internal.h:689:12
0000000003a0facf content::StartupTaskRunner::RunAllTasksNow() ../../content/browser/startup_task_runner.cc:43:18
00000000036d128f content::BrowserMainLoop::CreateStartupTasks() ../../content/browser/browser_main_loop.cc:923:27
00000000036d3607 content::BrowserMainRunnerImpl::Initialize(content::MainFunctionParams const&) ../../content/browser/browser_main_runner_impl.cc:141:15
00000000036cfda7 content::BrowserMain(content::MainFunctionParams const&) ../../content/browser/browser_main.cc:43:32
0000000004c0b927 content::BrowserTestBase::SetUp() ../../content/public/test/browser_test_base.cc:317:3
0000000004bf6c53 content::ContentBrowserTest::SetUp() ../../content/public/test/content_browser_test.cc:104:20
00000000032ce6a3 testing::Test::Run() ../../third_party/googletest/src/googletest/src/gtest.cc:2487:3
00000000032cece3 testing::TestInfo::Run() ../../third_party/googletest/src/googletest/src/gtest.cc:2667:11
00000000032cefbf testing::TestCase::Run() ../../third_party/googletest/src/googletest/src/gtest.cc:2785:28
00000000032d37cb testing::internal::UnitTestImpl::RunAllTests() ../../third_party/googletest/src/googletest/src/gtest.cc:5047:43
00000000032d356b testing::UnitTest::Run() ../../third_party/googletest/src/googletest/src/gtest.cc:4663:10
0000000004c710ff base::TestSuite::Run() ../../base/test/test_suite.cc:277:16
0000000004bf99af content::ContentTestLauncherDelegate::RunTestSuite(int, char**) ../../content/test/content_test_launcher.cc:108:48
0000000004c394b7 content::LaunchTests(content::TestLauncherDelegate*, unsigned long, int, char**) ../../content/public/test/test_launcher.cc:645:31
0000000004bf9973 main ../../content/test/content_test_launcher.cc:138:10
v------> testing::android::JNI_NativeTest_RunTests(_JNIEnv*, base::android::JavaParamRef<_jobject*> const&, base::android::JavaParamRef<_jstring*> const&, base::android::JavaParamRef<_jstring*> const&, base::android::JavaParamRef<_jstring*> const&, base::android::JavaParamRef<_jobject*> const&, base::android::JavaParamRef<_jstring*> const&) ../../testing/android/native_test/native_test_launcher.cc:131:3
0000000002d2f9e3 Java_org_chromium_native_1test_NativeTest_nativeRunTests gen/testing/android/native_test/native_test_jni_headers/testing/jni/NativeTest_jni.h:58:0
000000000128a0ab <UNKNOWN> /data/app/org.chromium.content_browsertests_apk-1/oat/arm64/base.odex
000000000128b0ab <UNKNOWN> /data/app/org.chromium.content_browsertests_apk-1/oat/arm64/base.odex
0000000001289f5b <UNKNOWN> /data/app/org.chromium.content_browsertests_apk-1/oat/arm64/base.odex
000000000128980f <UNKNOWN> /data/app/org.chromium.content_browsertests_apk-1/oat/arm64/base.odex
00000000029d06e3 <UNKNOWN> /data/dalvik-cache/arm64/system@framework@boot.oat
[ RUN ] RenderProcessHostTest.FetchKeepAliveRendererProcess_Hung

Hey Yutaka,

I could use some help triaging an error I've seen from a test you wrote. It seems that the test is unable to release its lock during the teardown of a WaitableEvent. I've only seen this when the test has run with: --enable-features=VizDisplayCompositor

Any insight would be appreciated.

Thanks,
Jonathan
,
Aug 9
,
Aug 17
Hmm, the crash happens in base::MessageLoop::Run(), and I don't see anything interesting in the test code itself. Perhaps something is wrong in content::NestedMessagePumpAndroid? I'm cc-ing base owners and Android experts.
,
Aug 17
Issue 875179 has been merged into this issue.
,
Aug 17
Disabling this for Android in the meantime: https://crrev.com/c/1179158
,
Aug 17
I think this is an issue in NestedMessagePumpAndroid. RunState::waitable_event is being deleted while it's busy (error 16 is EBUSY, "resource busy"). IMO the issue is that NestedMessagePumpAndroid::ScheduleWork() races with NestedMessagePumpAndroid::Run() exiting: if a task is posted to its MessageLoop (resulting in ScheduleWork()) while it's quitting, waitable_event.Signal() can run concurrently with the event's destruction. It's a UAF to invoke waitable_event.Signal() after ~WaitableEvent (or in this case I guess it's a use-while-free..!) @mthiesse: can you have a look? Thanks!
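To make the race concrete, here is a minimal standalone sketch in standard C++ (not Chromium code, and not necessarily how the real fix was done). Event stands in for base::WaitableEvent, and SketchPump, current_event_, state_lock_ and RunSketch are hypothetical names invented for illustration. The assumption is that the event owned by Run() can be shared with ScheduleWork() through a reference taken under a lock, so a late Signal() keeps the event alive instead of racing with its destructor.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

// Stand-in for base::WaitableEvent: an auto-reset flag guarded by a mutex.
class Event {
 public:
  void Signal() {
    std::lock_guard<std::mutex> lock(mu_);
    signaled_ = true;
    cv_.notify_one();
  }
  void Wait() {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return signaled_; });
    signaled_ = false;  // Auto-reset after each wakeup.
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  bool signaled_ = false;
};

// Hypothetical pump: Run() publishes its event, ScheduleWork() signals it.
class SketchPump {
 public:
  // May be called from any thread, including while Run() is exiting.
  void ScheduleWork() {
    std::shared_ptr<Event> event;
    {
      std::lock_guard<std::mutex> lock(state_lock_);
      event = current_event_;  // Take a reference under the lock.
    }
    if (event)
      event->Signal();  // Our reference keeps the event alive during Signal().
  }

  // Blocks until |wakeups| signals were observed; returns the count handled.
  int Run(int wakeups) {
    assert(wakeups >= 0);
    auto event = std::make_shared<Event>();
    {
      std::lock_guard<std::mutex> lock(state_lock_);
      current_event_ = event;
    }
    int handled = 0;
    while (handled < wakeups) {
      event->Wait();
      ++handled;
    }
    {
      std::lock_guard<std::mutex> lock(state_lock_);
      current_event_.reset();  // Later ScheduleWork() calls become no-ops.
    }
    // The Event is destroyed only when the last reference (here or in a
    // concurrent ScheduleWork()) goes away, never while its mutex is held.
    return handled;
  }

 private:
  std::mutex state_lock_;
  std::shared_ptr<Event> current_event_;
};

// Drives the pump while another thread keeps calling ScheduleWork().
int RunSketch() {
  SketchPump pump;
  std::atomic<bool> done{false};
  std::thread poster([&] {
    while (!done.load()) {
      pump.ScheduleWork();
      std::this_thread::yield();
    }
  });
  int handled = pump.Run(3);
  done.store(true);
  poster.join();
  return handled;
}
```

In this sketch the poster thread keeps calling ScheduleWork() while Run() is returning, which is the interleaving described above. Had Run() owned the event as a plain stack member, that late Signal() could still hold the event's lock when the destructor ran.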
,
Aug 17
P1, since this is likely causing other undiagnosed flakiness on the Android bots (and we should re-enable that test once we figure this one out). Thanks!
,
Aug 17
,
Aug 21
Taking a look.
Comment 1 by yhirano@chromium.org, Aug 9