New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 680700 link

Starred by 1 user

Issue metadata

Status: Verified
Owner:
Closed: Jan 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug

Blocking:
issue 680398



Sign in to add a comment

Mustash ash process is crash looping on startup

Project Member Reported by jamescook@chromium.org, Jan 12 2017

Issue description

We're seeing chrome PFQ informational failures on peach_pit. I'm seeing something similar on my cave device with ToT chrome.

From peach_pit, here's crashinfo.tgz /var/log/ui/ui.LATEST

[13516:13518:0111/211547.816036:600135244:ERROR:service_manager.cc(997)] Error: The catalog was unable to read a manifest for service "tracing".
[13516:13518:0111/211547.889265:600208130:ERROR:service_manager.cc(997)] Error: The catalog was unable to read a manifest for service "tracing".
[13536:13536:0111/211547.890658:600209706:VERBOSE1:drm_device_handle.cc(83)] Succeeded authenticating /dev/dri/card0 in 0 ms with 1 attempt(s)
[13536:13536:0111/211547.891378:600210231:VERBOSE1:drm_util.cc(204)] Unable to get cursor width capability: Invalid argument
[13536:13536:0111/211547.891506:600210358:VERBOSE1:drm_util.cc(337)] Failed to get EDID blob for connector 20
[13536:13536:0111/211547.917092:600235947:VERBOSE1:drm_display_host_manager.cc(242)] Got display event ADD for /dev/dri/card0
[13536:13545:0111/211547.996626:600315482:VERBOSE1:drm_util.cc(204)] Unable to get cursor width capability: Invalid argument
[13536:13545:0111/211547.996745:600315598:VERBOSE1:drm_util.cc(337)] Failed to get EDID blob for connector 20
[13536:13536:0111/211547.996920:600315774:VERBOSE1:display_configurator.cc(951)] Display snapshots invalidated.
[13536:13536:0111/211547.997022:600315874:VERBOSE1:update_display_configuration_task.cc(64)] OnDisplaysUpdated: new_display_state=SINGLE new_power_state=ALL_ON flags=1 force_configure=1 display_count=1
[13536:13536:0111/211547.997135:600315988:VERBOSE1:display_configurator.cc(216)] EnterState: display=SINGLE power=ALL_ON
[13536:13545:0111/211547.997269:600316121:VERBOSE1:drm_display.cc(102)] DRM configuring: device=/sys/devices/platform/exynos-drm/drm/card0 crtc=27 connector=20 origin=0,0 size=1366x768
[13524:13524:0111/211548.120319:600439174:ERROR:wm_shell_mus.cc(320)] Not implemented reached in virtual void ash::mus::WmShellMus::RecordUserMetricsAction(ash::UserMetricsAction)
[13524:13524:0111/211548.120977:600439830:ERROR:shell_delegate_mus.cc(168)] Not implemented reached in virtual std::unique_ptr<ash::PaletteDelegate> ash::ShellDelegateMus::CreatePaletteDelegate()
[13524:13524:0111/211548.121103:600439956:ERROR:wm_shell_mus.cc(396)] Not implemented reached in virtual void ash::mus::WmShellMus::AddDisplayObserver(ash::WmDisplayObserver*)
terminate called after throwing an instance of 'std::out_of_range'
  what():  vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
[13536:13536:0111/211548.165681:600484539:VERBOSE1:display_configurator.cc(1062)] OnConfigured: success=1 new_display_state=SINGLE new_power_state=ALL_ON
[13536:13536:0111/211548.165980:600484834:ERROR:screen_manager_ozone.cc(337)] Not implemented reached in virtual void display::ScreenManagerOzone::CloseMirroringDisplayIfNotNecessary()
[13536:13545:0111/211548.168337:600487192:VERBOSE1:drm_util.cc(204)] Unable to get cursor width capability: Invalid argument
[13536:13542:0111/211548.210245:600529102:ERROR:texture_manager.cc(3224)] [GroupMarkerNotSet( crbug.com/242999 )!:68B8C1B8]GL ERROR :GL_INVALID_FRAMEBUFFER_OPERATION : glTexImage2D: <- error from previous GL command
[13521:13521:0111/211549.377026:601696119:ERROR:session.cc(22)] Restarting service: ash
[13664:13664:0111/211549.892939:602211795:ERROR:wm_shell_mus.cc(320)] Not implemented reached in virtual void ash::mus::WmShellMus::RecordUserMetricsAction(ash::UserMetricsAction)
[13664:13664:0111/211549.894349:602213204:ERROR:shell_delegate_mus.cc(168)] Not implemented reached in virtual std::unique_ptr<ash::PaletteDelegate> ash::ShellDelegateMus::CreatePaletteDelegate()
[13664:13664:0111/211549.894496:602213349:ERROR:wm_shell_mus.cc(396)] Not implemented reached in virtual void ash::mus::WmShellMus::AddDisplayObserver(ash::WmDisplayObserver*)
terminate called after throwing an instance of 'std::out_of_range'
  what():  vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
[13521:13521:0111/211550.729300:603048159:ERROR:session.cc(22)] Restarting service: ash
[13689:13689:0111/211551.643355:603962212:ERROR:wm_shell_mus.cc(320)] Not implemented reached in virtual void ash::mus::WmShellMus::RecordUserMetricsAction(ash::UserMetricsAction)
[13689:13689:0111/211551.644306:603963161:ERROR:shell_delegate_mus.cc(168)] Not implemented reached in virtual std::unique_ptr<ash::PaletteDelegate> ash::ShellDelegateMus::CreatePaletteDelegate()
[13689:13689:0111/211551.644525:603963380:ERROR:wm_shell_mus.cc(396)] Not implemented reached in virtual void ash::mus::WmShellMus::AddDisplayObserver(ash::WmDisplayObserver*)

When I run locally on cave I get:

terminate called after throwing an instance of 'std::out_of_range'
  what():  vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
Received signal 6
#0 0x628cb852fc76 base::debug::StackTrace::StackTrace()
#1 0x628cb8530070 base::debug::(anonymous namespace)::StackDumpSignalHandler()
#2 0x724cdaf50530 <unknown>
#3 0x724cd9bdce32 gsignal
#4 0x724cd9bdec56 abort
#5 0x724cda4d2625 __gnu_cxx::__verbose_terminate_handler()
#6 0x724cda4d0366 <unknown>
#7 0x724cda4d03b1 std::terminate()
#8 0x724cda4d0621 __cxa_throw
#9 0x724cda52d5f1 std::__throw_out_of_range_fmt()
#10 0x628cb810a6ce ash::mus::WmShellMus::GetPrimaryRootWindow()
#11 0x628cba4ecf9a ash::WmShell::GetPrimaryRootWindowController()
#12 0x628cb8114f51 ash::mus::WindowManager::GetPrimaryRootWindowController()
#13 0x628cb811544f ash::mus::WindowManager::Shutdown()
#14 0x628cb85ee612 mojo::InterfaceEndpointClient::NotifyError()
#15 0x628cb85f12ae mojo::internal::MultiplexRouter::ProcessNotifyErrorTask()
#16 0x628cb85f3503 mojo::internal::MultiplexRouter::ProcessTasks()
#17 0x628cb85f4558 mojo::internal::MultiplexRouter::OnPipeConnectionError()
#18 0x628cb85ec16a mojo::Connector::HandleError()
#19 0x628cb85f9bb8 mojo::Watcher::OnHandleReady()
#20 0x628cb85cc57b base::debug::TaskAnnotator::RunTask()
#21 0x628cb85503bf base::MessageLoop::RunTask()
#22 0x628cb855084e base::MessageLoop::DeferOrRunPendingTask()
#23 0x628cb8552d08 base::MessageLoop::DoWork()
#24 0x628cb8553269 base::MessagePumpLibevent::Run()
#25 0x628cb855013a base::MessageLoop::RunHandler()
#26 0x628cb8576cdd base::RunLoop::Run()
#27 0x628cb65d74bf MashRunner::StartChildApp()
#28 0x628cb65d5a94 _ZN4base8internal7InvokerINS0_9BindStateIM10MashRunnerFvN4mojo16InterfaceRequestIN15service_manager5mojom7ServiceEEEEJNS0_17UnretainedWrapperIS3_EEEEEFvS9_EE3RunEPNS0_13BindStateBaseEOS9_
#29 0x628cb697adf5 service_manager::RunStandaloneService()
#30 0x628cb65d69ec MashRunner::RunChild()
#31 0x628cb65d6b6a MashMain()
#32 0x628cb65d3f35 ChromeMain
#33 0x724cd9bc9796 __libc_start_main
#34 0x628cb65d3cd9 _start
  r8: 0000724cd9f4f8e0  r9: 0000724cd984e740 r10: 0000000000000008 r11: 0000000000000202
 r12: 0000724cd9f4e7a0 r13: 000016b32b328e80 r14: 0000000000000000 r15: 000016b32b2aeb40
  di: 0000000000006929  si: 0000000000006929  bp: 00007ffcc9725a80  bx: 000016b32b320c18
  dx: 0000000000000006  ax: 0000000000000000  cx: ffffffffffffffff  sp: 00007ffcc9725958
  ip: 0000724cd9bdce32 efl: 0000000000000202 cgf: 0000000000000033 erf: 0000000000000000
 trp: 0000000000000000 msk: 0000000000014003 cr2: 0000000000000000
[end of stack trace]

This seems odd, like maybe it's a secondary problem. Ash isn't finding a RootWindowController. The stack looks like it is responding to something else going down, like the window server.

 
Blocking: 680398
peach_pit builders with a 12-ish hour regression range (there were some compile failures last night):

https://uberchromegw.corp.google.com/i/chromeos.chrome/builders/peach_pit-tot-chrome-pfq-informational

Issue 680398 has links to logs from autotest if that would be helpful.

Project Member

Comment 3 by bugdroid1@chromium.org, Jan 12 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/76587234139b667292d0488fbcafe7dd0c593abf

commit 76587234139b667292d0488fbcafe7dd0c593abf
Author: James Cook <jamescook@chromium.org>
Date: Thu Jan 12 22:02:04 2017

Move desktopui_MashLogin autotest off bvt-cq

The ash process in chrome --mash is crash-looping on startup.
Take the test off the CQ until we can sort that out.

BUG= chromium:680700 
TEST=desktopui_MashLogin

Change-Id: I13bad4a5d5c7b5295d0dab0687cc747e8cb38250
Reviewed-on: https://chromium-review.googlesource.com/427412
Tested-by: James Cook <jamescook@chromium.org>
Trybot-Ready: James Cook <jamescook@chromium.org>
Reviewed-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>

[modify] https://crrev.com/76587234139b667292d0488fbcafe7dd0c593abf/client/site_tests/desktopui_MashLogin/control

Cc: roc...@chromium.org
Running a debug chrome on cave I see this stack before the ash one in comment #1.

[8126:8160:0112/143705.477463:532219562:FATAL:interface_endpoint_client.cc(32)] Check failed: !is_valid. The callback passed to Connector::BindInterface() was never run.
#0 0x72b60ba0edf5 base::debug::StackTrace::StackTrace()
#1 0x72b60ba736d1 logging::LogMessage::~LogMessage()
#2 0x72b60c2bf190 mojo::(anonymous namespace)::DCheckIfInvalid()
#3 0x72b60c2bf787 mojo::(anonymous namespace)::ResponderThunk::DCheckInvalid()
#4 0x57b18b4db0a4 service_manager::mojom::Connector_BindInterface_ProxyToResponder::~Connector_BindInterface_ProxyToResponder()
#5 0x57b18b4dc588 std::default_delete<>::operator()()
#6 0x57b18b4db607 std::unique_ptr<>::~unique_ptr()
#7 0x57b18b4daf20 base::internal::PassedWrapper<>::~PassedWrapper()
#8 0x57b18b4dd29a std::_Head_base<>::~_Head_base()
#9 0x57b18b4dd2f0 _ZNSt11_Tuple_implILm0EJN4base8internal13PassedWrapperISt10unique_ptrIN15service_manager5mojom40Connector_BindInterface_ProxyToResponderESt14default_deleteIS6_EEE
EEED2Ev
#10 0x57b18b4dd33a _ZNSt5tupleIJN4base8internal13PassedWrapperISt10unique_ptrIN15service_manager5mojom40Connector_BindInterface_ProxyToResponderESt14default_deleteIS6_EEEEEED2Ev
#11 0x57b18b4dd4cc _ZN4base8internal9BindStateIMN15service_manager5mojom40Connector_BindInterface_ProxyToResponderEFvNS3_13ConnectResultERKSsEJNS0_13PassedWrapperISt10unique_ptrIS4
_St14default_deleteIS4_EEEEEED2Ev
#12 0x57b18b4dd6ad _ZN4base8internal9BindStateIMN15service_manager5mojom40Connector_BindInterface_ProxyToResponderEFvNS3_13ConnectResultERKSsEJNS0_13PassedWrapperISt10unique_ptrIS4
_St14default_deleteIS4_EEEEEE7DestroyEPKNS0_13BindStateBaseE
#13 0x72b60b9de5fa base::internal::BindStateBase::Release()
#14 0x72b60b9deca0 scoped_refptr<>::Release()
#15 0x72b60b9dead5 scoped_refptr<>::~scoped_refptr()
#16 0x72b60b9de92c base::internal::CallbackBase<>::~CallbackBase()
#17 0x57b18b2e9f72 base::internal::CallbackBase<>::~CallbackBase()
#18 0x57b18b2e9fc2 base::Callback<>::~Callback()
#19 0x57b18b4d9f85 service_manager::mojom::ConnectorStubDispatch::AcceptWithResponder()
#20 0x57b18cfd3421 service_manager::mojom::ConnectorStub<>::AcceptWithResponder()
#21 0x72b60c2c1666 mojo::InterfaceEndpointClient::HandleValidatedMessage()
#22 0x72b60c2bf93b mojo::InterfaceEndpointClient::HandleIncomingMessageThunk::Accept()
#23 0x72b60c2bd031 mojo::FilterChain::Accept()
#24 0x72b60c2c115b mojo::InterfaceEndpointClient::HandleIncomingMessage()
#25 0x72b60c2d30a3 mojo::internal::MultiplexRouter::ProcessIncomingMessage()
#26 0x72b60c2d15fd mojo::internal::MultiplexRouter::Accept()
#27 0x72b60c2bd031 mojo::FilterChain::Accept()
#28 0x72b60c2b27df mojo::Connector::ReadSingleMessage()
#29 0x72b60c2b290c mojo::Connector::ReadAllAvailableMessages()
#30 0x72b60c2b2248 mojo::Connector::OnHandleReadyInternal()
#31 0x72b60c2b1fbc mojo::Connector::OnWatcherHandleReady()
#32 0x72b60c2b53d8 _ZN4base8internal13FunctorTraitsIMN4mojo9ConnectorEFvjEvE6InvokeIPS3_JjEEEvS5_OT_DpOT0_
#33 0x72b60c2b5154 _ZN4base8internal12InvokeHelperILb0EvE8MakeItSoIRKMN4mojo9ConnectorEFvjEJPS5_jEEEvOT_DpOT0_
#34 0x72b60c2b4a44 _ZN4base8internal7InvokerINS0_9BindStateIMN4mojo9ConnectorEFvjEJNS0_17UnretainedWrapperIS4_EEEEEFvjEE7RunImplIRKS6_RKSt5tupleIJS8_EEJLm0EEEEvOT_OT0_NS_13IndexSeq
uenceIJXspT1_EEEEOj
#35 0x72b60c2b47d9 _ZN4base8internal7InvokerINS0_9BindStateIMN4mojo9ConnectorEFvjEJNS0_17UnretainedWrapperIS4_EEEEEFvjEE3RunEPNS0_13BindStateBaseEOj
#36 0x72b60c25fd18 base::internal::RunMixin<>::Run()
#37 0x72b60c25f575 mojo::Watcher::OnHandleReady()
#38 0x72b60c2607f0 _ZN4base8internal13FunctorTraitsIMN4mojo7WatcherEFvjEvE6InvokeIRKNS_7WeakPtrIS3_EEJRKjEEEvS5_OT_DpOT0_
#39 0x72b60c26064c _ZN4base8internal12InvokeHelperILb1EvE8MakeItSoIRKMN4mojo7WatcherEFvjERKNS_7WeakPtrIS5_EEJRKjEEEvOT_OT0_DpOT1_
#40 0x72b60c2601a9 _ZN4base8internal7InvokerINS0_9BindStateIMN4mojo7WatcherEFvjEJNS_7WeakPtrIS4_EEjEEEFvvEE7RunImplIRKS6_RKSt5tupleIJS8_jEEJLm0ELm1EEEEvOT_OT0_NS_13IndexSequenceIJX
spT1_EEEE
#41 0x72b60c25ff85 _ZN4base8internal7InvokerINS0_9BindStateIMN4mojo7WatcherEFvjEJNS_7WeakPtrIS4_EEjEEEFvvEE3RunEPNS0_13BindStateBaseE
#42 0x72b60ba12dd6 _ZNO4base8internal8RunMixinINS_8CallbackIFvvELNS0_8CopyModeE0ELNS0_10RepeatModeE0EEEE3RunEv
#43 0x72b60ba12839 base::debug::TaskAnnotator::RunTask()
#44 0x72b60ba9f9b4 base::MessageLoop::RunTask()
#45 0x72b60ba9fb10 base::MessageLoop::DeferOrRunPendingTask()
#46 0x72b60baa0059 base::MessageLoop::DoWork()
#47 0x72b60babb808 base::MessagePumpDefault::Run()
#48 0x72b60ba9f590 base::MessageLoop::RunHandler()
#49 0x72b60bb53f1d base::RunLoop::Run()
#50 0x57b18cfff637 service_manager::BackgroundServiceManager::MojoThread::Run()
#51 0x72b60bbf9059 base::SimpleThread::ThreadMain()
#52 0x72b60bbe42fb base::(anonymous namespace)::ThreadFunc()
#53 0x72b60bef0578 <unknown>
#54 0x72b5f231f6dd clone

pid 8126 seems to be the parent process of ash. I'm not sure what this trace means.

[8126:8162:0112/143701.475668:528217994:INFO:service_process_launcher.cc(211)] Launched child process pid=8163, instance=b365bdca-e533-40c7-b44c-1cdb30ba40d8, name=chrome_mash, use
r_id=505C0EE9-3013-43C0-82B0-A84F50CF8D84

Comment 5 by roc...@chromium.org, Jan 12 2017

This means someone is dropping a message response callback without either
running it or closing the binding which received the corresponding request.

Comment 6 by roc...@chromium.org, Jan 12 2017

Cc: ben@chromium.org
Almost certainly https://codereview.chromium.org/2610853013

Comment 7 by roc...@chromium.org, Jan 12 2017

I've commented on the above CL regarding the likely source of trouble.
I'll try reverting that CL locally and see what happens.

chrome --mash starts normally with that CL reverted locally.
I reverted https://codereview.chromium.org/2610853013 last night.
Owner: jamescook@chromium.org
Status: Verified (was: Assigned)
Thanks for the fix! I can verify mustash works on a Pixel at ToT this morning.
Project Member

Comment 12 by bugdroid1@chromium.org, Feb 15 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/b3aacdfefdf522134883c6cf8d0ffbb3d097c928

commit b3aacdfefdf522134883c6cf8d0ffbb3d097c928
Author: James Cook <jamescook@chromium.org>
Date: Wed Feb 15 18:47:31 2017

Put desktopui_MashLogin back into bvt-cq

The chrome --mash startup crash has been resolved and the test has
been stable on bvt-perbuild for 2 weeks. Put it back in bvt-cq so
it runs on the Chrome PFQ and ToT PFQ informational builders.

BUG= chromium:680700 
TEST=desktopui_MashLogin

Change-Id: I29e6c6ac085f2bc4f8058bd346f7da5b05a96d2f
Reviewed-on: https://chromium-review.googlesource.com/442669
Commit-Ready: James Cook <jamescook@chromium.org>
Tested-by: James Cook <jamescook@chromium.org>
Reviewed-by: Dan Erat <derat@chromium.org>

[modify] https://crrev.com/b3aacdfefdf522134883c6cf8d0ffbb3d097c928/client/site_tests/desktopui_MashLogin/control

Components: -Internals>MUS Internals>Services>WindowService

Sign in to add a comment