
Issue 630698

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Aug 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug




"dbus_unittests (with patch)" is flaky

Project Member Reported by chromium...@appspot.gserviceaccount.com, Jul 22 2016

Issue description

"dbus_unittests (with patch)" is flaky.

This issue was created automatically by the chromium-try-flakes app. Please find the right owner to fix the respective test/step and assign this issue to them. If the step/test is infrastructure-related, please add Infra-Troopers label and change issue status to Untriaged. When done, please remove the issue from Sheriff Bug Queue by removing the Sheriff-Chromium label.

We have detected 3 recent flakes. List of all flakes can be found at https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyJgsSBUZsYWtlIhtkYnVzX3VuaXR0ZXN0cyAod2l0aCBwYXRjaCkM.


 
Cc: mpear...@chromium.org
Components: Infra
Labels: -Sheriff-Chromium Infra-Troopers
These failures relate to an X server already being active on this machine ("Server is already active for display :9"). Either a test wasn't cleaned up properly or tests are overlapping inappropriately. In either case, it's what I'd call an infrastructure problem. Tagging appropriately.

---
Traceback (most recent call last):
  File "/b/c/b/linux_chromeos/src/infra/scripts/legacy/scripts/slave/runtest.py", line 560, in <module>
    sys.exit(main())
  File "/b/c/b/linux_chromeos/src/infra/scripts/legacy/scripts/slave/runtest.py", line 550, in main
    return _Main(options, args, extra_env)
  File "/b/c/b/linux_chromeos/src/infra/scripts/legacy/scripts/slave/runtest.py", line 231, in _Main
    'True'))
  File "/b/c/b/linux_chromeos/src/infra/scripts/legacy/scripts/slave/xvfb.py", line 106, in StartVirtualX
[Running on builder: "linux_chromium_chromeos_rel_ng"]
DBUS_SESSION_BUS_ADDRESS env var not found, starting dbus-launch
 setting DBUS_SESSION_BUS_ADDRESS to unix:abstract=/tmp/dbus-W4OQFvLbCB,guid=cd3fc51de4ba2435195e13a800000720
 setting DBUS_SESSION_BUS_PID to 24379
Verifying Xvfb is not running ...
Verifying Xvfb has started...
xdisplaycheck failed after 30 seconds.
xdisplaycheck output:
> Failed to connect to :9
Xvfb exited, code 1
Xvfb output:
> 
> Fatal server error:
> Server is already active for display 9
> 	If this server is no longer running, remove /tmp/.X9-lock
> 	and start again.
> 
Stopping Xvfb with pid 24383 ...
... killing failed, presuming unnecessary.
Xvfb pid file removed
 killed dbus-daemon with PID 24379
 cleared DBUS_SESSION_BUS_ADDRESS environment variable
    raise Exception(logs)
Exception: Failed to connect to :9
@@@STEP_CURSOR@dbus_unittests (with patch)@@@
step returned non-zero exit code: 1
@@@STEP_FAILURE@@@
---
Cc: satorux@chromium.org hashimoto@chromium.org
satorux@ - It looks like you added these. The first few failures look like Python failures. Are we doing any special setup for these tests?

If the flakiness history is complete, 24 failures since Jan 2015 doesn't seem very high in the grand scheme of things...
Cc: tandrii@chromium.org
Components: -Infra
Labels: -Infra-Troopers
mpearson@ this shouldn't be in the infrastructure team's component, and certainly not for troopers, given that this isn't an urgent new issue.

stevenjb@ who would be the best owner for this? Even at a lower priority, given the grand scheme of things :)

Comment 4 by satorux@google.com, Jul 25 2016

stevenjb@: dbus_unittests is just a C++ binary and does not do any special setup, but it assumes that dbus-daemon is running on the machine, which is usually the case on Linux.

tandrii@ I think this was considered an infra problem because the failures had something to do with the Xvfb setup in the test infra Python scripts.
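
For context, the runtest.py log earlier in this thread shows the wrapper starting dbus-launch when DBUS_SESSION_BUS_ADDRESS is not set. Below is a minimal Python sketch of that kind of check. It assumes dbus-launch prints plain NAME=value lines; the function names are hypothetical and are not taken from runtest.py.

```python
def needs_dbus_launch(environ):
    """Return True if no session bus address is present in the environment."""
    return "DBUS_SESSION_BUS_ADDRESS" not in environ


def parse_dbus_launch_output(text):
    """Collect DBUS_SESSION_BUS_* variables from NAME=value lines.

    Assumption: one NAME=value pair per line. The address itself contains
    '=' characters, so only the first '=' on each line is treated as the
    separator.
    """
    env = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition("=")
        if sep and name.startswith("DBUS_SESSION_BUS_"):
            env[name] = value
    return env
```

A wrapper could call needs_dbus_launch(os.environ) first and only then run dbus-launch and merge the parsed values into the environment it passes to the test binary.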
Components: Infra>Labs
satorux@ thanks, I get it now. Well, then I'm adding Infra>Labs for Xvfb setup knowledge.
Project Member

Comment 6 by chromium...@appspot.gserviceaccount.com, Jul 25 2016

Labels: Sheriff-Chromium
Detected 3 new flakes for test/step "dbus_unittests (with patch)". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyJgsSBUZsYWtlIhtkYnVzX3VuaXR0ZXN0cyAod2l0aCBwYXRjaCkM. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).
Ping for this unassigned infra bug.
Owner: friedman@chromium.org
Status: Assigned (was: Untriaged)
You just need Xvfb installed?
Project Member

Comment 9 by chromium...@appspot.gserviceaccount.com, Jul 26 2016

Detected 5 new flakes for test/step "dbus_unittests (with patch)". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyJgsSBUZsYWtlIhtkYnVzX3VuaXR0ZXN0cyAod2l0aCBwYXRjaCkM. This message was posted automatically by the chromium-try-flakes app.
> You just need Xvfb installed?

I don't know what's needed, but it's definitely something about the Xvfb configuration. Note that the problem is flaky, so it's unlikely that Xvfb is simply missing (otherwise, how would the test pass sometimes?).
Are you not closing your Xvfb sessions after tests? Do these bots reboot after tests? Can you re-create the error?
>>>
Are you not closing your Xvfb sessions after tests? Do these bots reboot after tests? Can you re-create the error?
>>>

I was the sheriff at the time this bug was filed. I don't know anything about the dbus_unittests test. I don't know if the dbus tests close their Xvfb sessions after tests. However, I'd guess this error could be caused by any other test the builder is running not closing its session after the test. I'd hope that each test step would be set up as a blank slate; i.e., the infra code running the sequence of separate test suites would clean up all bad state between tests.

I have not tried recreating the error.

I don't think the bots reboot after these tests, at least judging by looking at the bot pages.
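
To illustrate the "blank slate" idea from this comment: one way a step runner could guarantee that a helper process such as Xvfb is stopped even when the test step fails is a context manager. This is only a sketch under assumptions; managed_process is a hypothetical helper, not something that exists in the infra scripts.

```python
import contextlib
import subprocess


@contextlib.contextmanager
def managed_process(cmd, grace_seconds=5):
    """Start cmd and make sure it is gone when the with-block exits."""
    proc = subprocess.Popen(cmd)
    try:
        yield proc
    finally:
        if proc.poll() is None:  # still running
            proc.terminate()  # polite request first
            try:
                proc.wait(timeout=grace_seconds)
            except subprocess.TimeoutExpired:
                proc.kill()  # force it if it ignores the request
                proc.wait()
```

The cleanup sits in a finally clause, so it runs whether the body returns normally or raises. It does not help if the runner itself is killed, which is one way a stale lock file could still be left behind.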

Labels: -Sheriff-Chromium
Not sure if the sheriffs can provide any more value on this bug. Removing the Sheriff-Chromium label.

This sounds like something that's best addressed by Infra (on the component) or the test owner (on Cc).
Components: -Infra>Labs
I don't think this is labs either... it sounds like one of the builders isn't properly cleaning up after it runs.

Comment 15 by vabr@chromium.org, Jul 27 2016

There is a similar issue with ui_arc_unittests ( bug 631932 ).

Comment 16 by vabr@chromium.org, Jul 27 2016

 Issue 631932  has been merged into this issue.
Owner: ----
Status: Untriaged (was: Assigned)
Someone needs to look at these tests and make sure they are properly cleaning up, or reboot the slaves after testing.
Components: Infra
"Someone needs to look at these tests and make sure they are properly cleaning up, or reboot the slaves after testing."

It's not obvious which test is not properly cleaning up.  See comment #12.

I suggest the infra team revise their scripts per the error message (displayed in comment #1) to remove /tmp/.X9-lock and launch Xvfb again if the initial launch fails.  Adding Infra tag again for this reason.
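
A rough Python sketch of that suggestion follows. It is not the actual xvfb.py change. It assumes the lock file holds the PID of the X server that created it, and all function names here are hypothetical. The lock is removed only when that PID is no longer running, so a live server on the same display is left alone.

```python
import errno
import os


def lock_path(display, lock_dir="/tmp"):
    """Path of the X lock file for a display number, e.g. /tmp/.X9-lock."""
    return os.path.join(lock_dir, ".X%d-lock" % display)


def pid_is_alive(pid):
    """Return True if a process with this PID appears to exist."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks existence without sending a signal
    except OSError as e:
        return e.errno == errno.EPERM  # exists, but owned by another user
    return True


def remove_stale_lock(display, lock_dir="/tmp"):
    """Delete the lock file if the PID recorded in it is not running.

    Returns True if a stale lock was removed, False otherwise.
    """
    path = lock_path(display, lock_dir)
    try:
        with open(path) as f:
            pid = int(f.read().strip())
    except (IOError, OSError, ValueError):
        return False  # no lock file, or unexpected contents: leave it alone
    if pid_is_alive(pid):
        return False
    os.remove(path)
    return True


def start_with_retry(launch, display, lock_dir="/tmp", attempts=2):
    """Call launch() until it returns True, clearing a stale lock in between."""
    for _ in range(attempts):
        if launch():
            return True
        if not remove_stale_lock(display, lock_dir):
            break  # nothing was cleaned up, so retrying will not help
    return False
```

Here launch stands in for whatever starts Xvfb and confirms the display is reachable; it is expected to return True on success.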

If this doesn't solve the issue, rebooting the slaves each time is a (heavy-handed) option.
Cc: -tandrii@chromium.org
Chrome Infra doesn't own the test launcher scripts, as far as I am aware.
tandrii@, can you please determine who does and assign them this bug?  I have no idea even where to start looking.
Cc: tandrii@chromium.org
Components: -Infra
Adding tandrii@ to answer the question.

Removing infra component.
Cc: csharp@chromium.org petermayo@chromium.org phajdan.jr@chromium.org
I have no idea who the owner is, but legacy/scripts/slave/runtest.py was last touched (moved) by phajdan.jr@, so I'm adding him. Blame then suggests petermayo@ and csharp@, who worked on this back in 2012.
Cc: -tandrii@chromium.org
Labels: hotlist-infra-opportunity
Cc: -phajdan.jr@chromium.org phajdan@google.com
Components: -Tests>Flaky Infra>Client>Chrome
Owner: phajdan.jr@chromium.org
Status: Started (was: Untriaged)
I uploaded https://codereview.chromium.org/2216543002 . Also see related issue 628517 .
Project Member

Comment 26 by bugdroid1@chromium.org, Aug 4 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/5916a6410762e178bee821135bc5e1059c77cab9

commit 5916a6410762e178bee821135bc5e1059c77cab9
Author: phajdan.jr <phajdan.jr@chromium.org>
Date: Thu Aug 04 12:47:49 2016

xvfb: improve robustness when stale lock files exist

This ports https://codereview.chromium.org/2153083002
to src-side xvfb.py .

BUG= 630698 , 628517

Review-Url: https://codereview.chromium.org/2216543002
Cr-Commit-Position: refs/heads/master@{#409767}

[modify] https://crrev.com/5916a6410762e178bee821135bc5e1059c77cab9/infra/scripts/legacy/scripts/slave/xvfb.py

Status: Fixed (was: Started)
Pawel mentioned in a meeting that this is fixed.
