tricky-tot-chrome-pfq-informational desktopui_MashLogin is very flaky
Issue description
build info: https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8942644889899480272
desktopui_MashLogin FAIL: Unhandled BrowserConnectionGoneException: Timed out while waiting 240s for _GetDevToolsClient.
10:25:28.911 WARNI|desktopui_MashLogi:0024| Unable to capture screenshot. Unable to take screenshot. There may not be anything on the screen.
10:29:35.379 ERROR| browser:0054| Failed with BrowserConnectionGoneException while starting the browser backend.
10:29:35.945 WARNI| test:0637| The test failed with the following exception
Traceback (most recent call last):
  File "/usr/local/autotest/common_lib/test.py", line 631, in _exec
    _call_test_function(self.execute, *p_args, **p_dargs)
  File "/usr/local/autotest/common_lib/test.py", line 837, in _call_test_function
    raise error.UnhandledTestFail(e)
UnhandledTestFail: Unhandled BrowserConnectionGoneException: Timed out while waiting 240s for _GetDevToolsClient.
Found Minidump: False
,
Jun 26 2018
There is a crash while running the test, see https://stainless.corp.google.com/browse/chromeos-autotest-results/212050582-chromeos-test/:
0 chrome!ui::DrmThreadProxy::AddBindingDrmDevice(mojo::InterfaceRequest<ui::ozone::mojom::DrmDevice>) [thread.h : 233 + 0x0]
1 chrome!ui::(anonymous namespace)::OzonePlatformGbm::CreateDrmDeviceBinding(mojo::InterfaceRequest<ui::ozone::mojom::DrmDevice>, service_manager::BindSourceInfo const&) [ozone_platform_gbm.cc : 142 + 0x5]
2 chrome!base::internal::Invoker<base::internal::BindState<void (SpellCheck::*)(mojo::InterfaceRequest<spellcheck::mojom::SpellChecker>), base::WeakPtr<SpellCheck> >, void (mojo::InterfaceRequest<spellcheck::mojom::SpellChecker>)>::Run(base::internal::BindStateBase*, mojo::InterfaceRequest<spellcheck::mojom::SpellChecker>&&) [bind_internal.h : 507 + 0x2]
3 chrome!content::ServiceManagerConnectionImpl::IOThreadContext::WrapServiceRequestHandlerNoPID(base::RepeatingCallback<void (mojo::InterfaceRequest<service_manager::mojom::Service>)> const&, mojo::InterfaceRequest<service_manager::mojom::Service>, mojo::InterfacePtr<service_manager::mojom::PIDReceiver>) [callback.h : 129 + 0x3]
4 chrome!base::internal::Invoker<base::internal::BindState<void (*)(base::RepeatingCallback<void (mojo::InterfaceRequest<catalog::mojom::Catalog>, service_manager::BindSourceInfo const&)> const&, mojo::InterfaceRequest<catalog::mojom::Catalog>, service_manager::BindSourceInfo const&), base::RepeatingCallback<void (mojo::InterfaceRequest<catalog::mojom::Catalog>, service_manager::BindSourceInfo const&)>, mojo::InterfaceRequest<catalog::mojom::Catalog>, service_manager::BindSourceInfo>, void ()>::RunOnce(base::internal::BindStateBase*) [bind_internal.h : 407 + 0x2]
5 chrome!base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) [callback.h : 99 + 0x6]
6 chrome!base::MessageLoop::RunTask(base::PendingTask*) [incoming_task_queue.cc : 129 + 0xf]
7 chrome!base::MessageLoop::DoWork() [message_loop.cc : 329 + 0x8]
8 chrome!base::MessagePumpLibevent::Run(base::MessagePump::Delegate*) [message_pump_libevent.cc : 210 + 0x5]
9 chrome!<name omitted> [run_loop.cc : 102 + 0x8]
10 chrome!base::Thread::ThreadMain() [thread.cc : 337 + 0x6]
11 chrome!base::(anonymous namespace)::ThreadFunc(void*) [platform_thread_posix.cc : 76 + 0x5]
12 libpthread-2.23.so!start_thread [pthread_create.c : 333 + 0x11]
13 libc-2.23.so!clone + 0x6d
,
Jun 27 2018
spang@, do you have any idea about the crash stack in #2?
,
Jun 27 2018
Looking more deeply into the failure on the tricky informational build, the test is actually quite flaky (it failed 3 out of 5 runs), and the first failing run is https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8942644889899480272. Based on the call stack and git history, I suspect it might be caused by one of the recent ozone changes under ui/ozone/platform/drm/. spang@, I noticed that you landed several drm changes recently. Could you take a look?
,
Jun 27 2018
Rob owns these mojo bindings.
,
Jun 27 2018
I can repro this failure locally. I ran the test on eve 10 times, and it failed 2 times with the same issue: FAIL: Unhandled BrowserConnectionGoneException: Timed out while waiting 240s for _GetDevToolsClient.
,
Jun 28 2018
I get some scary errors when I run it, but they don't originate from our code:
[17169:17277:0627/201125.655906:ERROR:message_pump_libevent.cc(181)] event_add failed(fd=100): Bad file descriptor (9)
[17169:17277:0627/201125.655949:ERROR:file_descriptor_watcher_posix.cc(118)] Failed to watch fd=100
and
[17294:17305:0627/201609.975831:FATAL:weak_ptr.cc(31)] Check failed: sequence_checker_.CalledOnValidSequence(). WeakPtrs must be checked on the same sequenced thread.
#0 0x00007aa93816eb6d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
#1 0x00005d4018d1dc30 in base::PlatformThread::Sleep (duration=...) at ../../base/threading/platform_thread_posix.cc:187
#2 0x00005d4018c6b603 in base::debug::WaitForDebugger (wait_seconds=<optimized out>, silent=true) at ../../base/debug/debugger.cc:28
#3 0x00005d4018c87e3d in base::internal::WeakReference::Flag::IsValid (this=0x35a8301bf7d0) at ../../base/memory/weak_ptr.cc:29
#4 0x00005d401565d4d2 in base::WeakPtr<net::(anonymous namespace)::DnsHTTPAttempt>::get (this=0x7aa92e2dd2f0)
at ../../base/memory/weak_ptr.h:243
#5 base::WeakPtr<net::(anonymous namespace)::DnsHTTPAttempt>::operator bool (this=0x7aa92e2dd2f0) at ../../base/memory/weak_ptr.h:256
#6 base::CallbackCancellationTraits<void (net::(anonymous namespace)::DnsHTTPAttempt::*)(net::URLRequest*, int), std::__1::tuple<base::WeakPtr<net::(anonymous namespace)::DnsHTTPAttempt>, net::URLRequest*, int>, void>::IsCancelled<base::WeakPtr<net::(anonymous namespace)::DnsHTTPAttempt>, net::URLRequest*, int> (receiver=...) at ../../base/bind_internal.h:937
#7 base::internal::ApplyCancellationTraitsImpl<void (net::(anonymous namespace)::DnsHTTPAttempt::*)(net::URLRequest*, int), std::__1::tuple<base::WeakPtr<net::(anonymous namespace)::DnsHTTPAttempt>, net::URLRequest*, int>, 0ul, 1ul, 2ul> (bound_args=..., functor=<optimized out>) at ../../base/bind_internal.h:734
#8 base::internal::ApplyCancellationTraits<base::internal::BindState<void (net::(anonymous namespace)::DnsHTTPAttempt::*)(net::URLRequest*, int), base::WeakPtr<net::(anonymous namespace)::DnsHTTPAttempt>, net::URLRequest*, int> > (base=0x35a830160000) at ../../base/bind_internal.h:745
#9 0x00005d4018c666a8 in base::internal::BindStateBase::IsCancelled (this=0x7aa92e2dd2f0) at ../../base/callback_internal.h:89
#10 base::internal::CallbackBase::IsCancelled (this=0x7aa92e2dd840) at ../../base/callback_internal.cc:61
#11 0x00005d4018c8a001 in base::MessageLoop::DoWork (this=0x35a8300da8c0) at ../../base/message_loop/message_loop.cc:361
#12 0x00005d4018c8c296 in base::MessagePumpDefault::Run (this=0x35a83016cfc0, delegate=0x35a8300da8c0)
at ../../base/message_loop/message_pump_default.cc:37
#13 0x00005d4018c893e1 in base::MessageLoop::Run (this=0x35a8300da8c0, application_tasks_allowed=true)
at ../../base/message_loop/message_loop.cc:271
#14 0x00005d4018cb4c66 in base::RunLoop::Run (this=0x7aa92e2ddd70) at ../../base/run_loop.cc:102
#15 0x00005d4018cdf78a in base::Thread::Run (this=<optimized out>, run_loop=0x7aa92e2ddd70) at ../../base/threading/thread.cc:255
#16 0x00005d4018cdfade in base::Thread::ThreadMain (this=0x35a830165b90) at ../../base/threading/thread.cc:337
#17 0x00005d4018d1e28f in base::(anonymous namespace)::ThreadFunc (params=0x35a830054790)
at ../../base/threading/platform_thread_posix.cc:76
#18 0x00007aa9381652b8 in start_thread (arg=0x7aa92e2de700) at pthread_create.c:333
#19 0x00007aa937618fad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
,
Jun 28 2018
This is what I used to run the test locally inside the chroot:
sudo test_that --board=eve --results_dir=* (IP of the test device) desktopui_MashLogin --iterations=10
The error info that I can see:
11:44:25 INFO | autoserv| START desktopui_MashLogin desktopui_MashLogin timestamp=1530211464 localtime=Jun 28 11:44:24
11:44:25 INFO | autoserv| Bundling /build/eve/usr/local/build/autotest/client/site_tests/desktopui_MashLogin into test-desktopui_MashLogin.tar.bz2
11:48:37 INFO | autoserv| FAIL desktopui_MashLogin desktopui_MashLogin timestamp=1530211716 localtime=Jun 28 11:48:36 Unhandled BrowserConnectionGoneException: Timed out while waiting 240s for _GetDevToolsClient.
11:48:37 INFO | autoserv| END FAIL desktopui_MashLogin desktopui_MashLogin timestamp=1530211716 localtime=Jun 28 11:48:36
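For anyone repeating the --iterations run, a rough way to turn the output into a flake count is sketched below. It assumes each iteration logs one "autoserv| START <test>" line and, on failure, one "autoserv| END FAIL <test>" line, exactly as in the excerpt above; the helper name tally_flakes is made up for illustration and is not part of autotest.

```python
import re

# Line shapes assumed from the autoserv excerpt in this comment:
#   "... INFO | autoserv| START <test> <test> timestamp=..."
#   "... INFO | autoserv| END FAIL <test> <test> timestamp=..."
START_RE = re.compile(r"autoserv\|\s+START\s+(\S+)")
END_FAIL_RE = re.compile(r"autoserv\|\s+END FAIL\s+(\S+)")


def tally_flakes(log_text, test_name):
    """Return (runs_started, runs_failed) for test_name (hypothetical helper)."""
    started = 0
    failed = 0
    for line in log_text.splitlines():
        match = START_RE.search(line)
        if match and match.group(1) == test_name:
            started += 1
            continue
        match = END_FAIL_RE.search(line)
        if match and match.group(1) == test_name:
            failed += 1
    return started, failed
```

Calling tally_flakes(captured_output, "desktopui_MashLogin") on the saved console output would give the started/failed pair for the whole run.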
,
Jun 28 2018
I tried resetting to https://chromium.googlesource.com/chromium/src/+/bb1059e537e6de55913f7c526f01fc46b4a5642d and re-ran the test. It is still flaky: it failed 18/20.
,
Jun 29 2018
I haven't seen the flake for the past 2 days. Will keep monitoring to see if we can close this one.
,
Jun 29 2018
To be more precise, I haven't seen the flake since 6/28.
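As a rough sanity check on how many clean runs are needed before closing, the sketch below estimates how likely an all-green streak is if the bug were still present. It assumes runs are independent with a constant failure rate, which real flakes often violate, and it plugs in the rates reported earlier in this thread (3 of 5, 2 of 10, 18 of 20). The function names are illustrative only.

```python
import math


def prob_all_pass(fail_rate, runs):
    """Chance that `runs` independent runs all pass at the given failure rate."""
    return (1.0 - fail_rate) ** runs


def runs_needed(fail_rate, max_chance=0.05):
    """Smallest streak of passes whose chance is at most max_chance if the bug persists."""
    if not 0.0 < fail_rate < 1.0:
        raise ValueError("fail_rate must be strictly between 0 and 1")
    return math.ceil(math.log(max_chance) / math.log(1.0 - fail_rate))


if __name__ == "__main__":
    # Failure counts taken from the comments above; independence is an assumption.
    observed = [
        ("informational build", 3, 5),
        ("local eve runs", 2, 10),
        ("local runs after reset", 18, 20),
    ]
    for label, failed, total in observed:
        rate = failed / total
        print(f"{label}: rate={rate:.2f}, clean runs needed={runs_needed(rate)}")
```

Under these assumptions, even the lowest observed rate (2 of 10) needs about 14 consecutive passing runs before a green streak is unlikely to be luck.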
,
Jul 13
Seems closeable, yes? Mash is deprecated. OopAsh is the future.
Comment 1 by x...@chromium.org
, Jun 26 2018