Issue 693231

Starred by 2 users

Issue metadata

Status: Archived
Owner:
Closed: Feb 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



"BrowserTest.Title" is flaky

Project Member Reported by chromium...@appspot.gserviceaccount.com, Feb 16 2017

Issue description

"BrowserTest.Title" is flaky.

This issue was created automatically by the chromium-try-flakes app. Please find the right owner to fix the respective test/step and assign this issue to them. If the step/test is infrastructure-related, please add Infra-Troopers label and change issue status to Untriaged. When done, please remove the issue from Sheriff Bug Queue by removing the Sheriff-Chromium label.

We have detected 3 recent flakes. List of all flakes can be found at https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyHAsSBUZsYWtlIhFCcm93c2VyVGVzdC5UaXRsZQw.

Flaky tests should be disabled within 30 minutes unless culprit CL is found and reverted. Please see more details here: https://sites.google.com/a/chromium.org/dev/developers/tree-sheriffs/sheriffing-bug-queues#triaging-auto-filed-flakiness-bugs
 
Cc: fsam...@chromium.org sadrul@chromium.org sky@chromium.org mfomitchev@chromium.org
sky, sadrul, fsamuel, and mfomitchev: I'm unable to pin down the cause of this flake. Is this something you know about or have seen before? Any guidance on next steps? The flake hasn't happened in 18 hours; I will keep an eye on it and disable the test if it starts happening again.

This happens with the target mash_browser_tests (with patch) on linux_chromium_chromeos_ozone_rel_ng.
Cc: kylec...@chromium.org
+kylechar
Kyle and Fady have fixed some flakes in mash_browser_tests recently: issue 666481 (see the last two comments). Dupe?

Comment 3 by sky@chromium.org, Feb 17 2017

Those comments were from a couple of months back. Is 690970 the fix you are referring to from Kyle? That landed a couple of days ago.

Here's output from https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_chromeos_ozone_rel_ng/builds/323697/steps/mash_browser_tests%20%28with%20patch%29/logs/stdio :

[0216/131833.787456:ERROR:gpu_watchdog_thread.cc(373)] The GPU process hung. Terminating after 10000 ms.
Received signal 11 SEGV_MAPERR 000000000000
#0 0x0000033f2c67 base::debug::StackTrace::StackTrace()
#1 0x0000033f27df base::debug::(anonymous namespace)::StackDumpSignalHandler()
#2 0x7f949a1c3330 <unknown>
#3 0x000004a6aed9 gpu::GpuWatchdogThread::DeliberatelyTerminateToRecoverFromHang()
#4 0x000000746f3b _ZN4base8internal13FunctorTraitsIMNS_18CancelableCallbackIFvvEEEKFvvEvE6InvokeIRKNS_7WeakPtrIS4_EEJEEEvS6_OT_DpOT0_
#5 0x00000349aab9 base::debug::TaskAnnotator::RunTask()
#6 0x0000034120bd base::MessageLoop::RunTask()
#7 0x0000034128be base::MessageLoop::DoDelayedWork()
#8 0x00000341406d base::MessagePumpDefault::Run()
#9 0x000003411e3f base::MessageLoop::RunHandler()
#10 0x0000034388cf base::RunLoop::Run()
#11 0x000003461b1b base::Thread::Run()
#12 0x000003461fae base::Thread::ThreadMain()
#13 0x00000345b2ac base::(anonymous namespace)::ThreadFunc()
#14 0x7f949a1bb184 start_thread
#15 0x7f949691737d clone
  r8: 00007f948a38d700  r9: 00001d98fd774f48 r10: cccccccccccccccd r11: 0000000000000000
 r12: 000000000b452120 r13: 00007f948a38c110 r14: 00007f948a38bd70 r15: 00007f948a38bd78
  di: 00007f948a38bea0  si: 0000000000000000  bp: 00007f948a38c468  bx: 00001d98fbfc1100
  dx: 0000000000000773  ax: 0000000000000011  cx: 0000000000000001  sp: 00007f948a38bd60
  ip: 0000000004a6aed9 efl: 0000000000010202 cgf: 0000000000000033 erf: 0000000000000006
 trp: 000000000000000e msk: 0000000000000000 cr2: 0000000000000000
[end of stack trace]

GPU process looks to be hanging?

A subsequent stack is:
[1:1:0216/131833.852504:FATAL:context_provider_command_buffer.cc(178)] Check failed: channel_. 
#0 0x0000033f2c67 base::debug::StackTrace::StackTrace()
#1 0x00000340a8fa logging::LogMessage::~LogMessage()
#2 0x000001ed345e ui::ContextProviderCommandBuffer::ContextProviderCommandBuffer()
#3 0x000001ed5c46 ui::Gpu::CreateContextProvider()
#4 0x00000757b3fb content::RenderThreadImpl::CreateCompositorFrameSink()
#5 0x0000075968d6 content::RenderWidget::CreateCompositorFrameSink()
#6 0x000007596930 content::RenderWidget::CreateCompositorFrameSink()
#7 0x000007694a60 content::RenderWidgetCompositor::RequestNewCompositorFrameSink()
#8 0x000004cda2eb cc::ProxyMain::RequestNewCompositorFrameSink()
#9 0x000000746f3b _ZN4base8internal13FunctorTraitsIMNS_18CancelableCallbackIFvvEEEKFvvEvE6InvokeIRKNS_7WeakPtrIS4_EEJEEEvS6_OT_DpOT0_
#10 0x00000349aab9 base::debug::TaskAnnotator::RunTask()
#11 0x000006307540 blink::scheduler::TaskQueueManager::ProcessTaskFromWorkQueue()
#12 0x0000063052a5 blink::scheduler::TaskQueueManager::DoWork()
#13 0x000001659952 _ZN4base8internal13FunctorTraitsIMN13safe_browsing31PhishingDOMFeatureExtractorTestEFvbEvE6InvokeIRKNS_7WeakPtrIS3_EEJbEEEvS5_OT_DpOT0_
#14 0x00000349aab9 base::debug::TaskAnnotator::RunTask()
#15 0x0000034120bd base::MessageLoop::RunTask()
#16 0x000003412755 base::MessageLoop::DoWork()
#17 0x0000034140c9 base::MessagePumpDefault::Run()
#18 0x000003411e3f base::MessageLoop::RunHandler()
#19 0x0000034388cf base::RunLoop::Run()
#20 0x0000075a27ce content::RendererMain()
#21 0x0000033a8a37 content::RunZygote()
#22 0x0000033a999c content::ContentMainRunnerImpl::Run()
#23 0x0000033a8360 content::ContentMain()
#24 0x000003ae1bcd content::LaunchTests()
#25 0x0000033e4c81 main
#26 0x7f6e236dff45 __libc_start_main
#27 0x00000060e981 <unknown>

Not sure if that is because of the earlier crash.
sadrul@ is the one who has most recently been doing GPU service work.
It's quite possible that this is a "normal" part of the test, but since we don't have the process split yet, we're failing the test. Perhaps we can disable the GPU watchdog thread for now?
Labels: -Sheriff-Chromium
Owner: sadrul@chromium.org
Status: Available (was: Untriaged)
Sorry, I somehow misread the date of the comments. This doesn't seem related to the DCHECK crash in issue 690970 either.

I am removing the Sheriff-Chromium label, since I think the sheriff's job here is done.

Assigning to Sadrul based on Fady's comment.
Project Member

Comment 7 by chromium...@appspot.gserviceaccount.com, Feb 18 2017

Labels: Sheriff-Chromium
Detected 3 new flakes for test/step "BrowserTest.Title". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyHAsSBUZsYWtlIhFCcm93c2VyVGVzdC5UaXRsZQw. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).
Project Member

Comment 8 by chromium...@appspot.gserviceaccount.com, Feb 20 2017

Detected 3 new flakes for test/step "BrowserTest.Title". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyHAsSBUZsYWtlIhFCcm93c2VyVGVzdC5UaXRsZQw. This message was posted automatically by the chromium-try-flakes app.

Comment 9 by sadrul@chromium.org, Feb 21 2017

If the test is stuck long enough that the watchdog is kicking in, then something is probably pretty wrong, and disabling the watchdog will probably still end with the test timing out.
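
For context on the watchdog behavior discussed here, the following is a minimal, single-threaded sketch of the hang-detection idea. It is not Chromium's actual gpu::GpuWatchdogThread: the HangDetector class and its method names are hypothetical, and only the 10000 ms timeout is taken from the log above. The frame name DeliberatelyTerminateToRecoverFromHang in the first stack suggests that the SEGV is an intentional crash issued once a deadline like this expires.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical illustration only; not the real gpu::GpuWatchdogThread.
using Clock = std::chrono::steady_clock;

class HangDetector {
 public:
  explicit HangDetector(std::chrono::milliseconds timeout)
      : timeout_(std::chrono::duration_cast<Clock::duration>(timeout)) {}

  // The watchdog asks the watched thread to respond and starts the countdown.
  void Arm(Clock::time_point now) {
    armed_ = true;
    deadline_ = now + timeout_;
  }

  // The watched thread responded, so the pending check is cancelled.
  void Acknowledge() { armed_ = false; }

  // True when a check is still pending at or after its deadline.
  bool IsHung(Clock::time_point now) const {
    return armed_ && now >= deadline_;
  }

 private:
  Clock::duration timeout_;
  Clock::time_point deadline_{};
  bool armed_ = false;
};
```

In this sketch, disabling the watchdog would only remove the IsHung check. A watched thread that never acknowledges would stay stuck, which is why the test would still be expected to time out.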

I am unable to repro the timeout/crash locally; the test passes fine for me. Maybe this has to do with ozone-x11, xvfb, and the test not running with --use-test-config?
The flakes started a few days after I landed the Ozone X11 change, but it could still be the problem.

I took away the --use-test-config flag because I thought it got automatically added when the mash_browser_tests started. It looks like I was wrong about that though, or that has changed in the meantime, because I no longer see where it gets added.

I'll add the flag back and see if that fixes the flakes.
I have a CL in progress to add the flag. But yeah, that's something to try out.
Project Member

Comment 12 by chromium...@appspot.gserviceaccount.com, Feb 21 2017

Detected 4 new flakes for test/step "BrowserTest.Title". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyHAsSBUZsYWtlIhFCcm93c2VyVGVzdC5UaXRsZQw. This message was posted automatically by the chromium-try-flakes app.
Project Member

Comment 13 by bugdroid1@chromium.org, Feb 21 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e7a4615d55bf04fad380cd6da95b0f7ffed08bfa

commit e7a4615d55bf04fad380cd6da95b0f7ffed08bfa
Author: sadrul <sadrul@chromium.org>
Date: Tue Feb 21 22:14:18 2017

mash: Add --use-test-config for mash_browser_tests.

Add the test config flag, so that the window server (with x11 backend) is
initialized properly (i.e. set to turn on override_redirect flag for the
x11 windows it creates). This is a speculative fix for the flaky time
outs in mash_browser_tests.

BUG= 693231 

Review-Url: https://codereview.chromium.org/2707013003
Cr-Commit-Position: refs/heads/master@{#451850}

[modify] https://crrev.com/e7a4615d55bf04fad380cd6da95b0f7ffed08bfa/chrome/test/base/mash_browser_tests_main.cc
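
As a rough illustration of what adding a test-only switch amounts to, here is a standalone sketch that appends --use-test-config to an argument list when it is missing. This is not the code from the CL: the helper names HasSwitch and EnsureSwitch and the use of std::vector<std::string> are assumptions made to keep the example self-contained. The real change edits chrome/test/base/mash_browser_tests_main.cc.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helpers for illustration; Chromium uses its own command-line class.
bool HasSwitch(const std::vector<std::string>& args, const std::string& name) {
  const std::string flag = "--" + name;
  return std::find(args.begin(), args.end(), flag) != args.end();
}

// Appends "--<name>" unless it is already present, so repeated calls are safe.
void EnsureSwitch(std::vector<std::string>& args, const std::string& name) {
  if (!HasSwitch(args, name))
    args.push_back("--" + name);
}
```

Making the helper idempotent means it does not matter whether some other part of the test launcher already adds the flag, which was the uncertainty raised earlier in this thread.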

Comment 15 by treib@chromium.org, Feb 23 2017

Are there any other quick-fix candidates? If not, I'll disable the test in the meantime.
We can try switching the test back to Ozone headless. What do you think sadrul? I'll put up a CL if so.
It's also interesting to note that this test runs on the 'Mojo ChromiumOS' FYI trybot as well. It's configured to be able to use swarming there too, but I guess it doesn't since it's an FYI bot. The test hung in the same way there, running without swarming:

https://build.chromium.org/p/chromium.fyi/builders/Mojo%20ChromiumOS/builds/12645
> We can try switching the test back to Ozone headless. What do you think sadrul?
> I'll put up a CL if so.

Let's try that. Thanks

Comment 19 by treib@chromium.org, Feb 23 2017

Labels: OS-Chrome
Status: Started (was: Available)
Pending CL: https://codereview.chromium.org/2715713003/
Cc: jonr...@chromium.org
Project Member

Comment 21 by bugdroid1@chromium.org, Feb 23 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/3de0c00180f67f978e98d3fe501d924a43440215

commit 3de0c00180f67f978e98d3fe501d924a43440215
Author: kylechar <kylechar@chromium.org>
Date: Thu Feb 23 20:25:40 2017

Switch mash_browser_tests back to Ozone headless.

The test has been flakey for the last few days. Switch back to Ozone
headless to see if it helps.

The test also runs on an FYI trybot. Change it so that test continues to
run Ozone X11 but it will now use osmesa instead of EGL. This will give
some indication of if it's Ozone X11 in general, something to do with
Ozone X11 EGL or unrelated.

BUG= 693231 

Review-Url: https://codereview.chromium.org/2715713003
Cr-Commit-Position: refs/heads/master@{#452606}

[modify] https://crrev.com/3de0c00180f67f978e98d3fe501d924a43440215/testing/buildbot/chromium.chromiumos.json
[modify] https://crrev.com/3de0c00180f67f978e98d3fe501d924a43440215/testing/buildbot/chromium.fyi.json

Comment 22 by treib@chromium.org, Feb 24 2017

Status: Fixed (was: Started)
That seems to have done the trick! Let's assume fixed until proven otherwise :)
Owner: kylec...@chromium.org
Issue 656308 has been merged into this issue.

Comment 25 by dchan@google.com, Apr 17 2017

Labels: VerifyIn-59

Comment 26 by dchan@google.com, May 30 2017

Labels: VerifyIn-60
Labels: VerifyIn-61

Comment 28 by dchan@chromium.org, Oct 14 2017

Status: Archived (was: Fixed)
