
Issue 645280


Issue metadata

Status: WontFix
Owner: ----
Closed: Nov 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 3
Type: Bug

Blocked on:
issue 645279
issue 647061

Blocking:
issue 644669




Task took Swarming bot offline

Project Member Reported by kbr@chromium.org, Sep 8 2016

Issue description

The following task:
https://chromium-swarm.appspot.com/user/task/3120d46edf385910

took build545-m4 offline:
https://chromium-swarm.appspot.com/restricted/bot/build545-m4

Strangely, the task ID that the bot reports as its last one differs from this one (the low bit is set; I'm not sure whether that means something in Swarming), though it appears to be the same task:
https://chromium-swarm.appspot.com/user/task/3120d46edf385911
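
As a quick check of the "low bit" observation, here is a minimal sketch that XORs the two task IDs from the URLs above. The helper name `differing_bits` is made up for illustration. It only confirms that the IDs differ in the lowest bit; it says nothing about what that bit means in Swarming.

```python
def differing_bits(task_id_a, task_id_b):
    """Return the XOR of two hex task IDs; set bits mark where they differ."""
    return int(task_id_a, 16) ^ int(task_id_b, 16)


# Task IDs copied from the two URLs in this report.
reported_task = "3120d46edf385910"
bot_last_task = "3120d46edf385911"

diff = differing_bits(reported_task, bot_last_task)
print(hex(diff))  # prints 0x1: only the lowest bit differs
```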

This was from this CL:
https://codereview.chromium.org/2320023002

I filed Issue 645279 about bringing up this and some other dead bots. I'm not sure why the WindowServer is being killed by these jobs, but we need to make Swarming more resilient to this failure mode.

 

Comment 1 by kbr@chromium.org, Sep 14 2016

Blockedon: 647061

Comment 2 by kbr@chromium.org, Sep 16 2016

Blockedon: 644669
Cc: vadimsh@chromium.org mar...@chromium.org cwallez@chromium.org
Note: d4d44f9ccbce0cd089a3066c438952863921cd40 is intended to work around a graphics driver bug that was affecting this test. It just landed today, so one of the most recent instances of this problem:
https://chromium-swarm.appspot.com/user/task/3146977600a55010

happened before it landed.

Let's continue to watch chromium_try_flakes for this problem (Issue 619264) and see if it's resolved:
https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyMAsSBUZsYWtlIiV3ZWJnbDJfY29uZm9ybWFuY2VfdGVzdHMgKHdpdGggcGF0Y2gpDA

Swarming should be resilient to this failure mode though, and reboot the machine if it happens.
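
One way this kind of resilience could look is a small watchdog that polls for the WindowServer process and signals that a reboot is needed after several consecutive misses. This is a rough sketch, not Swarming's actual bot code: the function names, the threshold, and the choice of `pgrep -x` as the detection signal are all assumptions, and a missing WindowServer process may not be the right indicator of the offline state.

```python
import subprocess
import time


def process_is_running(name, run=subprocess.run):
    """Return True if `pgrep -x NAME` exits 0, i.e. at least one exact match."""
    result = run(
        ["pgrep", "-x", name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def needs_reboot(check, polls, threshold=3, interval_s=0.0, sleep=time.sleep):
    """Poll `check` up to `polls` times; True after `threshold` misses in a row.

    `check` is any zero-argument callable returning True when healthy.
    The threshold is a hypothetical default, not a tuned value.
    """
    misses = 0
    for _ in range(polls):
        if check():
            misses = 0  # a healthy poll resets the streak
        else:
            misses += 1
            if misses >= threshold:
                return True
        sleep(interval_s)
    return False
```

A caller might use `needs_reboot(lambda: process_is_running("WindowServer"), polls=10, interval_s=5)` and trigger the reboot itself when it returns True. Requiring consecutive misses is meant to avoid rebooting on a single transient failure.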

Comment 3 by mar...@chromium.org, Sep 23 2016

Cc: -mar...@chromium.org
Labels: -Restrict-View-Google
Owner: mar...@chromium.org
Status: Assigned (was: Untriaged)
Removed Restrict-View-Google (RVG); there's nothing internal here.

Will take another look at this specifically on the Swarming side.

Comment 4 by enne@chromium.org, Nov 8 2016

Issue 619264 still appears to be flaking, but should this issue be closed at this point?

Comment 5 by kbr@chromium.org, Nov 9 2016

Labels: -Pri-1 Pri-2
We can downgrade this to P2 (or lower) at this point, but no changes have gone into the Swarming code in response yet, and I think some should in order to make it more robust.

Comment 6 by mar...@chromium.org, Oct 30 2017

Labels: -Pri-2 Pri-3
Owner: ----
Status: Available (was: Assigned)
Some general improvements went into the bot, but I'm not familiar enough with OSX to know how to detect this state.

As long as we don't know what killed the bot, I can't determine what to do to detect the error state.

Comment 7 by mar...@chromium.org, Nov 21 2017

Cc: kbr@chromium.org
Is this still a problem worth investigating?

Comment 8 by kbr@chromium.org, Nov 21 2017

Blockedon: -644669
Blocking: 644669
Status: WontFix (was: Available)
Haven't seen this in a while, so closing as WontFix, but going forward we should jump on diagnosing serious problems like this one.
