
Issue 781281

Starred by 2 users

Issue metadata

Status: Duplicate
Merged: issue 783349
Owner:
Closed: Nov 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug-Regression




bot affinity 'build127-b1' is broken on chromium.perf/Mac Air 10.11 Perf, affecting 32 tests

Project Member Reported by eyaich@chromium.org, Nov 3 2017

Issue description

bot affinity 'build127-b1' is broken on chromium.perf/Mac Air 10.11 Perf, affecting 32 tests

Builders failed on: 
- Mac Air 10.11 Perf: 
  https://build.chromium.org/p/chromium.perf/builders/Mac%20Air%2010.11%20Perf
 
Components: Infra>Labs
Labs, would you please reboot build127-b1? It never generated output, which is what is causing the steps to error.

Owner: friedman@chromium.org
Status: Assigned (was: Available)
Rebooted.
Project Member

Comment 4 by bugdroid1@chromium.org, Nov 3 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/83b5f05cd6bf6a5f449182125a393ffd2e03493d

commit 83b5f05cd6bf6a5f449182125a393ffd2e03493d
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Fri Nov 03 18:40:15 2017

swarming.py: fix regression in adaf484266, improve decorated ouput

The client crashes on task without output, e.g. expired.

Changes the decorated output in case of failures to be informative.
Tailors based on the kind of task result. For example for an EXPIRED task,
it looks like:

+------------------------------------------------------------------------+
| End of shard 0                                                         |
|  Pending: 36052.1s  EXPIRED (lack of capacity)                         |
+------------------------------------------------------------------------+

In particular, call out "lack of capacity" to make this extra clear to
users.

For a BOT_DIED;

+------------------------------------------------------------------------+
| End of shard 0                                                         |
|  Pending: 0.4s  Duration: N/A  Bot: vm447-m4  Exit: N/A  BOT_DIED      |
+------------------------------------------------------------------------+

etc.

R=jchinlee@chromium.org
Bug:  781281 
Change-Id: Ie813ef3ce895880223322b4cfb80236d800666b3
Reviewed-on: https://chromium-review.googlesource.com/753846
Reviewed-by: Jao-ke Chin-Lee <jchinlee@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/83b5f05cd6bf6a5f449182125a393ffd2e03493d/client/swarming.py
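
For illustration, the decorated summary described in the commit message above could be produced along these lines. This is only a rough sketch, not the actual code in client/swarming.py: the function name `decorate_shard_summary`, the dictionary keys (`state`, `pending_s`, `duration_s`, `bot_id`, `exit_code`) and the default box width are all assumptions made for this example.

```python
def decorate_shard_summary(shard_index, result, width=72):
    """Return a boxed, human-readable summary for one shard.

    `result` is a hypothetical dict; its keys are assumptions for this
    sketch and are not taken from the real swarming client.
    """
    state = result.get("state", "UNKNOWN")
    fields = ["Pending: %.1fs" % result.get("pending_s", 0.0)]
    if state == "EXPIRED":
        # An expired task never ran, so there is no bot, duration or exit
        # code to report. Call out the likely cause explicitly instead.
        fields.append("EXPIRED (lack of capacity)")
    else:
        duration = result.get("duration_s")
        exit_code = result.get("exit_code")
        fields.append(
            "Duration: %s" % ("N/A" if duration is None else "%.1fs" % duration))
        fields.append("Bot: %s" % result.get("bot_id", "N/A"))
        fields.append("Exit: %s" % ("N/A" if exit_code is None else exit_code))
        if state != "COMPLETED":
            # Surface abnormal states such as BOT_DIED at the end of the row.
            fields.append(state)
    border = "+" + "-" * width + "+"
    rows = [" End of shard %d" % shard_index, "  " + "  ".join(fields)]
    # Rows longer than `width` are not truncated in this sketch.
    body = ["|" + row.ljust(width) + "|" for row in rows]
    return "\n".join([border] + body + [border])


print(decorate_shard_summary(0, {"state": "EXPIRED", "pending_s": 36052.1}))
print(decorate_shard_summary(
    0, {"state": "BOT_DIED", "pending_s": 0.4, "bot_id": "vm447-m4"}))
```

The point of branching on the task state is that a task with no output (expired, or killed along with its bot) still gets an informative summary instead of crashing the client.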

Status: Fixed (was: Assigned)
I assume this is done?

Owner: ----
Status: Available (was: Fixed)
Thanks, Elliott, for rebooting the bot! It seems the reboot didn't resolve the exceptions, however. I'm not sure of the root cause, so I'm marking this as Available so the current trooper can take a look.

Components: -Infra>Labs Infra>Client>Perf

Comment 8 by no...@chromium.org, Nov 9 2017

I, the trooper, cannot ssh to the bot, so I cannot analyze the swarming bot logs to determine why the bot died. Labs, please reboot it again.

FTR, here are the two most recent tasks that killed the bot:
https://chromium-swarm.appspot.com/task?id=39aca009fe7f2010
https://chromium-swarm.appspot.com/task?id=39b2c99d6bcbbc10

Comment 9 by no...@chromium.org, Nov 9 2017

Components: Infra>Labs
Owner: pschmidt@chromium.org
Status: Assigned (was: Available)

The slave has been locking up very regularly over the last few weeks.

I'm going to re-image it.

It's back up and in swarming.

Mergedinto: 783349
Status: Duplicate (was: Assigned)
