
Issue 666632


Issue metadata

Status: Duplicate
Merged: issue 666481
Owner:
Closed: Nov 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: ----
Type: ----




mash_browser_tests failing on chromium.chromiumos/Linux ChromiumOS Tests (1)

Project Member Reported by mgiuca@chromium.org, Nov 18 2016

Issue description

mash_browser_tests failing on chromium.chromiumos/Linux ChromiumOS Tests (1)

Type: build-failure

Builders failed on: 
- Linux ChromiumOS Tests (1): 
  https://build.chromium.org/p/chromium.chromiumos/builders/Linux%20ChromiumOS%20Tests%20%281%29

This just started happening in https://build.chromium.org/p/chromium.chromiumos/builders/Linux%20ChromiumOS%20Tests%20%281%29/builds/29606

Last known good: r432920
First known bad: r432939

The previous mash failures were a different stack trace.

[1117/194428:ERROR:gpu_service_internal.cc(87)] Not implemented reached in virtual void ui::GpuServiceInternal::DidCreateOffscreenContext(const GURL &)
[1117/194431:ERROR:gpu_service_internal.cc(92)] Not implemented reached in virtual void ui::GpuServiceInternal::DidDestroyChannel(int)
[1117/194431:ERROR:gpu_service_internal.cc(96)] Not implemented reached in virtual void ui::GpuServiceInternal::DidDestroyOffscreenContext(const GURL &)
Received signal 11 <unknown> 000000000000
#0 0x000002cd5b27 [1117/194431:ERROR:wm_shell_mus.cc(405)] Not implemented reached in virtual void ash::mus::WmShellMus::RemoveDisplayObserver(ash::WmDisplayObserver *)
base::debug::(anonymous namespace)::StackDumpSignalHandler()
#1 0x7f00cedc3cb0 <unknown>
#2 0x000006677251 ui::ws::GpuCompositorFrameSink::DidReceiveCompositorFrameAck()
#3 0x0000042b2f63 cc::Surface::RunDrawCallbacks()
#4 0x0000042b477b cc::SurfaceAggregator::ProcessAddedAndRemovedSurfaces()
#5 0x0000042b8551 cc::SurfaceAggregator::Aggregate()
#6 0x0000042aeb76 cc::Display::DrawAndSwap()
#7 0x0000042b0871 cc::DisplayScheduler::DrawAndSwap()
#8 0x0000042afcf6 cc::DisplayScheduler::OnBeginFrameDeadline()
#9 0x000002d4bdfe base::debug::TaskAnnotator::RunTask()
#10 0x000002cede6c base::MessageLoop::RunTask()
#11 0x000002cee118 base::MessageLoop::DeferOrRunPendingTask()
#12 0x000002cee3fb base::MessageLoop::DoWork()
#13 0x000002cef44a base::MessagePumpDefault::Run()
#14 0x000002cedc18 base::MessageLoop::RunHandler()
#15 0x000002d08a70 base::RunLoop::Run()
#16 0x000002cca72b (anonymous namespace)::StartChildApp()
#17 0x000001735334 _ZN4base8internal7InvokerINS0_9BindStateIPFvN4mojo16InterfaceRequestIN3ash5mojom15NewWindowClientEEEEJEEES9_E3RunEPNS0_13BindStateBaseEOS8_
#18 0x000002889cc3 service_manager::ChildProcessMainWithCallback()
#19 0x000002cca436 RunMashBrowserTests()
#20 0x000002cca303 main
#21 0x7f00ca2d176d __libc_start_main
#22 0x00000064c391 <unknown>
  r8: 0000000000000001  r9: 0000000000000001 r10: 656b6568d9c678d0 r11: b4eb841cec47bd2a
 r12: 000036b82b590000 r13: 000036b8360735d0 r14: 000036b82dcab968 r15: 00007fff0ac8a388
  di: 63f772ffe95fe548  si: 00007fff0ac8a388  bp: 0000000000001000  bx: 000036b82dcab800
  dx: 000000000000002f  ax: 000036b831198c80  cx: 0000000006677220  sp: 00007fff0ac8a360
  ip: 0000000006677251 efl: 0000000000010202 cgf: 0000000000000033 erf: 0000000000000000
 trp: 000000000000000d msk: 0000000000000000 cr2: 0000000000000000
[end of stack trace]
[1117/194432:ERROR:native_widget_mus.cc(859)] Not implemented reached in virtual void views::NativeWidgetMus::ViewRemoved(views::View *)
[1117/194432:ERROR:session.cc(20)] Restarting service: quick_launch
[1117/194432:ERROR:interface_registry.cc(210)] Failed to locate a binder for interface: tracing::mojom::Factory requested by: quick_launch exposed by: tracing via InterfaceProviderSpec "service_manager:connector".

I can't see an obvious culprit from the 9 CLs in the list.
 
Cc: msramek@chromium.org
This appeared in 8 of the last 13 builds, so it's a flake. The last three are green, so it may even have gone away.

I compared the failing and passing runs. You pasted the failing one, so here's the passing one:

https://build.chromium.org/p/chromium.chromiumos/builders/Linux%20ChromiumOS%20Tests%20%281%29/builds/29616/steps/mash_browser_tests/logs/stdio

============================================================================

[1118/022611:ERROR:gpu_service_internal.cc(87)] Not implemented reached in virtual void ui::GpuServiceInternal::DidCreateOffscreenContext(const GURL &)
[1118/022613:ERROR:gpu_service_internal.cc(92)] Not implemented reached in virtual void ui::GpuServiceInternal::DidDestroyChannel(int)
[1118/022613:ERROR:gpu_service_internal.cc(96)] Not implemented reached in virtual void ui::GpuServiceInternal::DidDestroyOffscreenContext(const GURL &)
[1118/022613:ERROR:wm_shell_mus.cc(405)] Not implemented reached in virtual void ash::mus::WmShellMus::RemoveDisplayObserver(ash::WmDisplayObserver *)
[1/1] BrowserTest.Title (10237 ms)
SUCCESS: all tests passed.
[1118/022614:ERROR:native_widget_mus.cc(859)] Not implemented reached in virtual void views::NativeWidgetMus::ViewRemoved(views::View *)
[1118/022614:ERROR:session.cc(20)] Restarting service: quick_launch
[1118/022614:ERROR:session.cc(20)] Restarting service: ash
Received signal 11 SEGV_MAPERR fffffffffffff352
#0 0x000002cd6db7 base::debug::(anonymous namespace)::StackDumpSignalHandler()
#1 0x7f79b0cdbcb0 <unknown>
#2 0x000001b3bd39 gpu::GpuChannelHost::Send()
#3 0x000001b3bf59 gpu::GpuChannelHost::OrderingBarrier()
...
#24 0x000002ccb6c6 RunMashBrowserTests()

============================================================================

In both cases, we get signal 11 (i.e. a segfault).

In the passing run, the segfault comes after BrowserTest.Title has already finished successfully. In the failing run, it comes slightly earlier: before RemoveDisplayObserver() and before the test has finished, so the test is not reported as successful.

So we always get a segfault; the only difference is the timing. One of the recent CLs uncovered the problem but didn't cause it. The segfault is the problem. The four NOTREACHED() instances also don't seem like something that should appear in a passing test, though I don't know yet whether they're related.
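
The timing comparison above can be sketched as a small log classifier. This is only an illustrative sketch: the function name `classify_log` and its return labels are hypothetical, and it assumes the log contains the literal strings "Received signal" and "SUCCESS: all tests passed." as they appear in the stdio logs quoted in this thread.

```python
import re

# Marker strings copied from the logs quoted in this issue.
SIGNAL_RE = re.compile(r"^Received signal (\d+)", re.MULTILINE)
SUCCESS_MARKER = "SUCCESS: all tests passed."


def classify_log(log_text: str) -> str:
    """Classify a log by where the first crash falls relative to the success line.

    The labels returned here are hypothetical names used for illustration.
    """
    crash = SIGNAL_RE.search(log_text)
    if crash is None:
        return "no-crash"
    success_at = log_text.find(SUCCESS_MARKER)
    if success_at != -1 and success_at < crash.start():
        # The test already reported success before the process crashed.
        return "crash-after-success"
    return "crash-before-success"
```

Run against the two logs in this thread, the passing run would fall into the "crash-after-success" bucket and the failing run into "crash-before-success", which is the distinction drawn above.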
The much older build https://build.chromium.org/p/chromium.chromiumos/builders/Linux%20ChromiumOS%20Tests%20%281%29/builds/29421/steps/mash_browser_tests/logs/stdio already contains the 4 NOTREACHED()s but not the segfault, so they're not related.

Build 29510 only has the NOTREACHED()s.
Build 29511 only has the segfault.
Build 29512 has both, and this state persists until today, even though the tests mostly pass.

And actually, 29511 crashed with the same stack trace as in comment #0. It's still the same problem.
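
The per-build comparison above amounts to checking each log for two markers. A minimal sketch, assuming the logs have already been downloaded as text; the helper name `log_markers` is hypothetical, and the strings it searches for are the "Not implemented reached" lines (what the comments here call NOTREACHED()) and the "Received signal 11" line from the quoted logs.

```python
def log_markers(log_text: str) -> dict:
    """Report which of the two symptoms discussed in this issue a log contains."""
    return {
        # The "Not implemented reached in ..." error lines.
        "not_implemented": "Not implemented reached" in log_text,
        # The segfault reported by the stack dump signal handler.
        "segfault": "Received signal 11" in log_text,
    }
```

Applying this to a range of build logs gives a quick table of which builds show which symptom, which is how the first build with the segfault can be narrowed down.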

Re #0: "The previous mash failures were a different stack trace." -> I am inclined to believe that this is just because the test was killed at a different time, as I mentioned in #1.
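
To compare stack traces across builds without reading each log by hand, one option is to pull out the first frame below the signal handler. This is a rough sketch under stated assumptions: `crash_frame` is a hypothetical helper, the frame format `#N 0xADDR symbol` is taken from the traces quoted in this thread, and it assumes frames #0 and #1 are always the signal handler and its caller, as they are in both quoted traces.

```python
import re
from typing import Optional

# Matches frame lines such as "#2 0x000006677251 some::Symbol()".
FRAME_RE = re.compile(r"^#(\d+) 0x[0-9a-f]+ (.*)$", re.MULTILINE)


def crash_frame(log_text: str, skip: int = 2) -> Optional[str]:
    """Return the symbol of the first frame numbered >= skip, or None.

    Frames below `skip` are assumed to belong to the signal handler.
    """
    for match in FRAME_RE.finditer(log_text):
        if int(match.group(1)) >= skip:
            return match.group(2).strip()
    return None
```

For the trace in comment #0 this returns `ui::ws::GpuCompositorFrameSink::DidReceiveCompositorFrameAck()`, and for the passing run quoted in comment #1 it returns `gpu::GpuChannelHost::Send()`. Skipping by frame number rather than by symbol name also sidesteps the interleaved log line on frame #0 of the first trace.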
Owner: ben@chromium.org
Status: Assigned (was: Available)
So, 29511 contained a single CL touching mash/, which is https://codereview.chromium.org/2503063003.

It looks like it's just replacing a string literal with a constant, but the stack trace begins in DidReceiveCompositorFrameAck inside src/services/ui/, which is at the core of this CL.

Assigning to ben@, the author, PTAL. Even if this is a red herring, you're an owner of both mash/ and services/ui, so you could at least have an idea what's going on.
Labels: -Sheriff-Chromium
Removing Sheriff-Chromium in the meantime.

Comment 7 by est...@chromium.org, Nov 18 2016

Mergedinto: 666481
Status: Duplicate (was: Assigned)
This crash stack has actually been showing up for a while, though not consistently.
