
Issue 742989

Starred by 1 user

Issue metadata

Status: Duplicate
Merged: issue 742966
Owner:
Closed: Jul 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 0
Type: Bug-Regression




mash_browser_tests flaky on Linux ChromiumOS Ozone Tests continuous and try bot

Project Member Reported by machenb...@chromium.org, Jul 14 2017

Issue description

Cc: fhorschig@chromium.org qyears...@chromium.org alancutter@chromium.org pmonette@chromium.org
This also looks quite bad on the CI bot:
https://build.chromium.org/p/chromium.chromiumos/builders/Linux%20ChromiumOS%20Ozone%20Tests%20(1)

Guess this is rather a sheriff issue. CC'ing all sheriffs.
Cc: -pmonette@chromium.org -qyears...@chromium.org -alancutter@chromium.org -fhorschig@chromium.org
Labels: Sheriff-Chromium
Applied Sheriff-Chromium label so swapped/following sheriffs will also know about this.

I assume this might be a sheriff issue because the last three failures always included:
ExperimentalAppWindowApiTest.SetIcon
and either 
CreateNewFolder/FileManagerBrowserTest.Test/2 or 
CreateNewFolder/FileManagerBrowserTest.Test/0 or both

First build with a failing SetIcon test:
https://build.chromium.org/p/chromium.chromiumos/builders/Linux%20ChromiumOS%20Ozone%20Tests%20%281%29/builds/49753
or if the crashed bot hides the problem:
https://build.chromium.org/p/chromium.chromiumos/builders/Linux%20ChromiumOS%20Ozone%20Tests%20%281%29/builds/49752

First build with a FileManagerBrowserTest failure:
https://build.chromium.org/p/chromium.chromiumos/builders/Linux%20ChromiumOS%20Ozone%20Tests%20%281%29/builds/49738
or if the crashed bot hides the problem:
https://build.chromium.org/p/chromium.chromiumos/builders/Linux%20ChromiumOS%20Ozone%20Tests%20%281%29/builds/49736 or
https://build.chromium.org/p/chromium.chromiumos/builders/Linux%20ChromiumOS%20Ozone%20Tests%20%281%29/builds/49737
Update to ExperimentalAppWindowApiTest.SetIcon:
The actual first failure was
https://build.chromium.org/p/chromium.chromiumos/builders/Linux%20ChromiumOS%20Ozone%20Tests%20%281%29/builds/49748
and in between, there have been failing tests without SetIcon failing.

Are we dealing with two extremely flaky tests here?
The logs contain error notes from the diagnostic writer which complain about:
PathUserData (Cannot obtain size for: /b/s/w/ito0_6k4/.org.chromium.Chromium.jI7Bie/dimCst8)
PathLocalState (Path not found: /b/s/w/ito0_6k4/.org.chromium.Chromium.jI7Bie/dimCst8/Local State)

... can this be a configuration/permission issue? (Given that these tests are browser_tests, which spin up a whole Chromium instance.)
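
To check the configuration/permission hypothesis by hand, one could test the same two things the log messages complain about: whether the size of the user-data directory can be computed, and whether it contains a `Local State` file. The following is only a minimal sketch. `check_user_data_dir` is a hypothetical helper that mimics the two messages above, not the real diagnostic writer, and the directory to pass in is whatever temporary profile path the bot reports.

```python
import os


def _raise(err):
    # os.walk swallows listing errors by default; re-raise them instead.
    raise err


def check_user_data_dir(user_data_dir):
    """Return a list of problems for a profile directory (hypothetical helper)."""
    problems = []
    try:
        if not os.path.isdir(user_data_dir):
            raise OSError("not a directory")
        total = 0
        # Sum file sizes; unreadable directories or files raise OSError.
        for root, _dirs, files in os.walk(user_data_dir, onerror=_raise):
            for name in files:
                total += os.path.getsize(os.path.join(root, name))
    except OSError:
        problems.append("Cannot obtain size for: " + user_data_dir)

    local_state = os.path.join(user_data_dir, "Local State")
    if not os.path.isfile(local_state):
        problems.append("Path not found: " + local_state)
    return problems
```

An empty list means neither complaint reproduces for that directory. If both appear, the directory itself is probably missing or unreadable, which would point at the bot setup rather than at the individual tests.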
Labels: -Infra-Troopers
Cc: kmarshall@chromium.org
Owner: qyears...@chromium.org
Status: Assigned (was: Untriaged)
Summary: mash_browser_tests flaky on Linux ChromiumOS Ozone Tests continuous and try bot (was: mash_browser_tests flaky timeout on linux_chromium_chromeos_ozone_rel_ng)
Latest builds for continuous builder: https://build.chromium.org/p/chromium.chromiumos/builders/Linux%20ChromiumOS%20Ozone%20Tests%20%281%29?numbuilds=200

Latest builds for try bot: https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_chromeos_ozone_rel_ng?numbuilds=200

Flakiness dashboard: https://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=mash_browser_tests

Flakiness dashboard now shows that 
  ExperimentalAppWindowApiTest.SetIcon
  CreateNewFolder/FileManagerBrowserTest.Test/{0,1,2,3}
have become flaky.

kmarshall@: Do you think that https://crrev.com/c/569418/ ("Remove File::Lock() and Unlock() under Fuchsia") or https://crrev.com/c/569845 (Reland "Remove unsupported perm, symlink calls for Fuchsia.") could be related? Do changes for Fuchsia affect "ChromeOS Ozone"?

There are also other possible changes around the start of the flakiness that could be worth looking at. Still not sure whether all those flaky test failures are related to the timeout exceptions.

Comment 6 by kbr@chromium.org, Jul 14 2017

Cc: dpranke@chromium.org reve...@chromium.org e...@chromium.org qyears...@chromium.org fsam...@chromium.org
Components: Internals>Compositing
Labels: -Pri-1 OS-Chrome Pri-0
Owner: kbr@chromium.org
The timeouts are happening because the compositor is asserting while it's being created. See the stack trace below.

This flaky failure is happening on a majority of tryjobs on this bot at this point. Just a few examples:
https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Ftryserver.chromium.linux%2Flinux_chromium_chromeos_ozone_rel_ng%2F429427%2F%2B%2Frecipes%2Fsteps%2Fmash_browser_tests__with_patch_%2F0%2Fstdout
https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Ftryserver.chromium.linux%2Flinux_chromium_chromeos_ozone_rel_ng%2F429421%2F%2B%2Frecipes%2Fsteps%2Fmash_browser_tests__with_patch_%2F0%2Fstdout
https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Ftryserver.chromium.linux%2Flinux_chromium_chromeos_ozone_rel_ng%2F429418%2F%2B%2Frecipes%2Fsteps%2Fmash_browser_tests__with_patch_%2F0%2Fstdout
https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Ftryserver.chromium.linux%2Flinux_chromium_chromeos_ozone_rel_ng%2F429416%2F%2B%2Frecipes%2Fsteps%2Fmash_browser_tests__with_patch_%2F0%2Fstdout

I think the appropriate thing to do given the severity of these errors and given that they've been happening since yesterday is to temporarily disable this test suite on this configuration until the cause is known. I'm preparing a CL to do this now.


[0714/094529.480934:FATAL:compositor.cc(498)] Check failed: false. 
#0 0x0000037b63ec base::debug::StackTrace::StackTrace()
#1 0x0000037ce5b1 logging::LogMessage::~LogMessage()
#2 0x0000055ea3ec ui::Compositor::DidFailToInitializeLayerTreeFrameSink()
#3 0x000004ee95af cc::LayerTreeHost::DidFailToInitializeLayerTreeFrameSink()
#4 0x000004ef4b18 cc::SingleThreadProxy::SetLayerTreeFrameSink()
#5 0x000004ee9111 cc::LayerTreeHost::SetLayerTreeFrameSink()
#6 0x0000055e9682 ui::Compositor::SetLayerTreeFrameSink()
#7 0x0000055e6b4e aura::MusContextFactory::OnEstablishedGpuChannel()
#8 0x0000055e6ede _ZN4base8internal13FunctorTraitsIMN4aura17MusContextFactoryEFvNS_7WeakPtrIN2ui10CompositorEEE13scoped_refptrIN3gpu14GpuChannelHostEEEvE6InvokeIRKNS4_IS3_EEJRKS7_SB_EEEvSD_OT_DpOT0_
#9 0x00000216b18f ui::Gpu::OnEstablishedGpuChannel()
#10 0x00000216b8a9 _ZN4base8internal7InvokerINS0_9BindStateIMN2ui3GpuEFviN4mojo16ScopedHandleBaseINS5_17MessagePipeHandleEEERKN3gpu7GPUInfoEEJNS0_17UnretainedWrapperIS4_EEEEEFviS8_SC_EE3RunEPNS0_13BindStateBaseEOiOS8_SC_
#11 0x0000020ac102 ui::mojom::Gpu_EstablishGpuChannel_ForwardToCallback::Accept()
#12 0x000004d67c55 mojo::InterfaceEndpointClient::HandleValidatedMessage()
#13 0x000004d67836 mojo::FilterChain::Accept()
#14 0x000004d68d35 mojo::InterfaceEndpointClient::HandleIncomingMessage()
#15 0x000004d6fd43 mojo::internal::MultiplexRouter::ProcessIncomingMessage()
#16 0x000004d6f58f mojo::internal::MultiplexRouter::Accept()
#17 0x000004d67836 mojo::FilterChain::Accept()
#18 0x000004d66b25 mojo::Connector::ReadSingleMessage()
#19 0x000004d67421 mojo::Connector::ReadAllAvailableMessages()
#20 0x000004d672d9 mojo::Connector::OnHandleReadyInternal()
#21 0x0000019f5295 chromeos::file_system_provider::Int64ToIntCompletionCallback()
#22 0x000004d7dd24 mojo::SimpleWatcher::OnHandleReady()
#23 0x000004d7e1b3 _ZN4base8internal7InvokerINS0_9BindStateIMN4mojo13SimpleWatcherEFvijRKNS3_18HandleSignalsStateEEJNS_7WeakPtrIS4_EEijS5_EEEFvvEE7RunImplIRKS9_RKSt5tupleIJSB_ijS5_EEJLm0ELm1ELm2ELm3EEEEvOT_OT0_NS_13IndexSequenceIJXspT1_EEEE
#24 0x000003862e40 base::debug::TaskAnnotator::RunTask()
#25 0x0000037d58d9 base::MessageLoop::RunTask()
#26 0x0000037d5b8b base::MessageLoop::DeferOrRunPendingTask()
#27 0x0000037d5f94 base::MessageLoop::DoWork()
#28 0x0000037d8459 base::MessagePumpLibevent::Run()
#29 0x0000037d550b base::MessageLoop::Run()
#30 0x0000037ff83a base::RunLoop::Run()
#31 0x0000037a72d2 (anonymous namespace)::StartEmbeddedService()
#32 0x0000037a7b0d _ZN4base8internal7InvokerINS0_9BindStateIPFvN4mojo16InterfaceRequestIN15service_manager5mojom7ServiceEEEEJEEES9_E3RunEPNS0_13BindStateBaseEOS8_
#33 0x0000021b5b02 service_manager::RunStandaloneService()
#34 0x0000037a6efe RunMashBrowserTests()
#35 0x0000037a6d57 main
#36 0x7f56e3788f45 __libc_start_main
#37 0x000000627d9e <unknown>
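
When comparing many such logs, it can help to pull out the first frame that is not part of the logging machinery. The following is a rough sketch, assuming the `#N 0xADDR symbol` line format shown in the trace above. `parse_frames` and `first_interesting_frame` are hypothetical helper names, and the skipped prefixes are an assumption based on frames #0 and #1 of this trace.

```python
import re

# Matches lines such as "#2 0x0000055ea3ec ui::Compositor::Foo()".
FRAME_RE = re.compile(r"^#(\d+)\s+(0x[0-9a-fA-F]+)\s+(.*)$")


def parse_frames(text):
    """Return (index, address, symbol) tuples for each stack-frame line."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((int(match.group(1)), match.group(2), match.group(3)))
    return frames


def first_interesting_frame(frames, skip_prefixes=("base::debug::", "logging::")):
    """Return (index, symbol) of the first frame outside the skipped prefixes."""
    for index, _address, symbol in frames:
        if not symbol.startswith(skip_prefixes):
            return index, symbol
    return None


sample = """\
#0 0x0000037b63ec base::debug::StackTrace::StackTrace()
#1 0x0000037ce5b1 logging::LogMessage::~LogMessage()
#2 0x0000055ea3ec ui::Compositor::DidFailToInitializeLayerTreeFrameSink()
#3 0x000004ee95af cc::LayerTreeHost::DidFailToInitializeLayerTreeFrameSink()
"""
print(first_interesting_frame(parse_frames(sample)))
```

For the trace above this selects frame #2, `ui::Compositor::DidFailToInitializeLayerTreeFrameSink()`, which is where the `Check failed: false` at compositor.cc(498) is reported.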

Comment 7 by kbr@chromium.org, Jul 14 2017

Status: Started (was: Assigned)
https://chromium-review.googlesource.com/572032/ is up for review temporarily disabling this test suite on CrOS / Ozone. A revert can be sent through the trybots multiple times to see whether the flakiness has been addressed.
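
As a rough guide to how many trybot runs of the revert would be convincing: if the suite still failed with probability p per run, the chance of n passes in a row by luck alone is (1 - p)^n. The sketch below assumes independent runs, and the 0.5 failure rate is only an assumption standing in for "a majority of tryjobs". `runs_needed` is a hypothetical helper.

```python
def runs_needed(failure_rate, max_false_pass=0.05):
    """Smallest n such that (1 - failure_rate) ** n <= max_false_pass.

    Assumes each run fails independently with the same probability.
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < max_false_pass < 1.0:
        raise ValueError("max_false_pass must be strictly between 0 and 1")
    n = 1
    while (1.0 - failure_rate) ** n > max_false_pass:
        n += 1
    return n


# With an assumed 50% failure rate, five clean runs leave about a 3% chance
# that the passes were just luck.
print(runs_needed(0.5))
```

A lower real failure rate would need more runs, so this is a lower bound on effort rather than a guarantee.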

Comment 8 by kbr@chromium.org, Jul 14 2017

Also, in case the severity of this problem (and Issue 742596) isn't clear, here's a screenshot of the current state of this trybot.

Screen Shot 2017-07-14 at 10.30.37 AM.png (736 KB)

Thanks kbr@!

I think that the check failure in the compositor (causing the timeouts) might be separate from the flaky test failures (in ExperimentalAppWindowApiTest.SetIcon and CreateNewFolder/FileManagerBrowserTest.Test). If so, those issues would be easier to track down after the cause of the compositor check failure is found.

Comment 10 by kbr@chromium.org, Jul 14 2017

Mergedinto: 742966
Status: Duplicate (was: Started)
It turns out this was filed earlier and work's already underway to repair it.
Split out non-timeout failures into a separate bug: bug 743123.
