
Issue 817976


Issue metadata

Status: Fixed
Owner:
Closed: May 2018
Cc:
Components:
EstimatedDays: ----
NextAction: 2018-05-08
OS: ----
Pri: 1
Type: Bug




Swarming: correctly report maintenance state

Project Member Reported by mar...@chromium.org, Mar 1 2018

Issue description

As part of issue 416072, state: maintenance was added and is now reported in the monitoring pipeline.

The problem is that this state is not:
- Stored in the DB in BotInfo
- Queryable in the bots/count or bots/list APIs
- Categorized in the Web UI

That's a huge usability problem, as bots "look idle" but they are not.

Action items (AIs):
- DB: update BotInfo.composite to have a new bit for MAINTENANCE = 1<<6
- API: update BotsRequest, BotsCount and BotInfo to have maintenance
- Web UI: add selectors and counts
- Support code: update everything that depends on the above, including client/swarming.py


BotInfo: https://cs.chromium.org/chromium/infra/luci/appengine/swarming/server/bot_management.py?q=%22class+BotInfo(_%22
BotsCount:
https://cs.chromium.org/chromium/infra/luci/appengine/swarming/swarming_rpcs.py?q=%22class+BotsCount(%22
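
To illustrate the DB action item, here is a minimal sketch of how a maintenance bit could sit next to other state bits. Only MAINTENANCE = 1<<6 comes from this issue; the other flag names, their values, and both helper functions are hypothetical and do not reflect the actual BotInfo code.

```python
# Sketch only. MAINTENANCE = 1 << 6 is taken from the action item above.
# BUSY and DEAD are hypothetical placeholders, not the real BotInfo constants.
BUSY = 1 << 0         # hypothetical
DEAD = 1 << 1         # hypothetical
MAINTENANCE = 1 << 6  # from the issue description


def composite_state(busy, dead, maintenance):
    """Packs boolean bot states into a single integer bitmask."""
    value = 0
    if busy:
        value |= BUSY
    if dead:
        value |= DEAD
    if maintenance:
        value |= MAINTENANCE
    return value


def looks_idle(value):
    """Returns True only if the bot is not busy, dead, or in maintenance."""
    return (value & (BUSY | DEAD | MAINTENANCE)) == 0


in_maintenance = composite_state(busy=False, dead=False, maintenance=True)
print(in_maintenance)              # 64
print(looks_idle(in_maintenance))  # False
```

Without the maintenance bit, the same bot would have a composite value of 0 and be indistinguishable from an idle one, which is the usability problem described above.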

 

Comment 1 by bugdroid1@chromium.org, Apr 4 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/c14663096bbb106dd7f0a6ae65cb5439afb45b68

commit c14663096bbb106dd7f0a6ae65cb5439afb45b68
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Wed Apr 04 20:48:51 2018

[swarming] Redo bot_management_test.py in preparation of refactor

Rename BotInfo.composite constants in preparation to add more.

Cleanup the tests a bit in preparation as more states are being added.

Bug: 826421, 817976
Change-Id: I84d4397713f52a5fec4dd392ce5f1f0f7747c70e
Reviewed-on: https://chromium-review.googlesource.com/993034
Reviewed-by: Robbie Iannucci <iannucci@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/c14663096bbb106dd7f0a6ae65cb5439afb45b68/appengine/swarming/server/bot_management.py
[modify] https://crrev.com/c14663096bbb106dd7f0a6ae65cb5439afb45b68/appengine/swarming/server/bot_management_test.py

Cc: -charliea@chromium.org nednguyen@chromium.org
Owner: charliea@chromium.org
I'll assume ownership of the remaining work on this bug.
Status: Assigned (was: Available)

Comment 4 by bugdroid1@chromium.org, May 3 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/55121137238e138f321a6d8ce6f0614ee5d4ef93

commit 55121137238e138f321a6d8ce6f0614ee5d4ef93
Author: Charlie Andrews <charliea@chromium.org>
Date: Thu May 03 20:31:28 2018

Plumb maintenance status from the state dict into the database

Work remains to make the database maintenance status filterable through
the API and to surface that filter in the UI.

Bug: 817976
Change-Id: Ifba962c848185a2b1bb5be24aa2a37190c6b7a69
Reviewed-on: https://chromium-review.googlesource.com/1040953
Commit-Queue: Charlie Andrews <charliea@chromium.org>
Reviewed-by: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/55121137238e138f321a6d8ce6f0614ee5d4ef93/appengine/swarming/handlers_bot.py
[modify] https://crrev.com/55121137238e138f321a6d8ce6f0614ee5d4ef93/appengine/swarming/handlers_bot_test.py
[modify] https://crrev.com/55121137238e138f321a6d8ce6f0614ee5d4ef93/appengine/swarming/handlers_endpoints.py
[modify] https://crrev.com/55121137238e138f321a6d8ce6f0614ee5d4ef93/appengine/swarming/handlers_endpoints_test.py
[modify] https://crrev.com/55121137238e138f321a6d8ce6f0614ee5d4ef93/appengine/swarming/handlers_test.py
[modify] https://crrev.com/55121137238e138f321a6d8ce6f0614ee5d4ef93/appengine/swarming/server/bot_management.py
[modify] https://crrev.com/55121137238e138f321a6d8ce6f0614ee5d4ef93/appengine/swarming/server/bot_management_test.py
[modify] https://crrev.com/55121137238e138f321a6d8ce6f0614ee5d4ef93/appengine/swarming/server/lease_management.py
[modify] https://crrev.com/55121137238e138f321a6d8ce6f0614ee5d4ef93/appengine/swarming/server/lease_management_test.py
[modify] https://crrev.com/55121137238e138f321a6d8ce6f0614ee5d4ef93/appengine/swarming/server/task_queues_test.py
[modify] https://crrev.com/55121137238e138f321a6d8ce6f0614ee5d4ef93/appengine/swarming/server/task_scheduler_test.py
[modify] https://crrev.com/55121137238e138f321a6d8ce6f0614ee5d4ef93/appengine/swarming/server/task_to_run_test.py
[modify] https://crrev.com/55121137238e138f321a6d8ce6f0614ee5d4ef93/appengine/swarming/swarming_rpcs.py
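
The remaining API work mentioned in the commit message above (filtering on maintenance status and counting it) could look roughly like the sketch below. It runs over in-memory records rather than the datastore, and the Bot record, filter_bots, count_bots and the in_maintenance parameter are all hypothetical names, not the actual BotsRequest or BotsCount definitions.

```python
from collections import namedtuple

# Hypothetical, simplified bot record; the real BotInfo entity has more fields.
Bot = namedtuple('Bot', ['bot_id', 'in_maintenance'])


def filter_bots(bots, in_maintenance=None):
    """Filters bots by maintenance status.

    in_maintenance is tri-state: True or False keeps only matching bots,
    None disables the filter.
    """
    if in_maintenance is None:
        return list(bots)
    return [b for b in bots if b.in_maintenance == in_maintenance]


def count_bots(bots):
    """Returns a summary with a dedicated maintenance bucket."""
    return {
        'count': len(bots),
        'maintenance': len(filter_bots(bots, in_maintenance=True)),
    }


bots = [Bot('bot1', False), Bot('bot2', True), Bot('bot3', False)]
print(count_bots(bots))
print([b.bot_id for b in filter_bots(bots, in_maintenance=True)])
```

A tri-state filter is used here so that existing callers that do not pass the parameter keep getting every bot.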


Comment 6 by bugdroid1@chromium.org, May 4 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/774054c8c6e44c20e035c5b72b1265ddbc284354

commit 774054c8c6e44c20e035c5b72b1265ddbc284354
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Fri May 04 19:42:49 2018

[swarming] Fix regression in dee907da04d5

There was a unit test, but because ndb.Model ignores invalid property with a
None value, the unit test didn't catch it.

Tweaked the unit test to reproduce the bug by specifying a non-None value. Grr.

Bug: 817976
Change-Id: Ief295dbf7f180ef5093cba7d56a3ce31f734333f
Reviewed-on: https://chromium-review.googlesource.com/1044889
Reviewed-by: Charlie Andrews <charliea@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/774054c8c6e44c20e035c5b72b1265ddbc284354/appengine/swarming/handlers_endpoints.py
[modify] https://crrev.com/774054c8c6e44c20e035c5b72b1265ddbc284354/appengine/swarming/handlers_endpoints_test.py
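
The failure mode described in the commit message above can be illustrated with a toy class. This is a hypothetical stand-in, not ndb.Model, and it does not reproduce the actual regression; LenientModel, the property name and the accepts helper are made up for the example. It only shows why a test that passes None for a misnamed property cannot catch the mistake.

```python
class LenientModel(object):
    """Hypothetical stand-in, NOT ndb.Model.

    Mimics the behavior described in the commit message: an unknown
    property is silently dropped when its value is None, and rejected
    when its value is anything else.
    """
    _properties = ('maintenance_msg',)  # hypothetical property name

    def __init__(self, **kwargs):
        for name, value in kwargs.items():
            if name not in self._properties:
                if value is None:
                    continue  # the misnamed property goes unnoticed
                raise AttributeError('unknown property: %s' % name)
            setattr(self, name, value)


def accepts(**kwargs):
    """Returns True if LenientModel can be constructed with kwargs."""
    try:
        LenientModel(**kwargs)
        return True
    except AttributeError:
        return False


# Misspelled property with a None value: the mistake is not detected.
print(accepts(maintenance_mesage=None))           # True
# Same misspelling with a real value: the mistake surfaces.
print(accepts(maintenance_mesage='fixing disk'))  # False
```

This matches the fix described in the commit: the test only reproduces the bug once it specifies a non-None value.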

The last CL, which surfaces the status on the bot list page, is in the CQ now. maruel@ and I agreed to wait until it reaches production, verify that everything works as expected, and close this bug if it does.
NextAction: 2018-05-08
Setting the next action date to tomorrow, since that's when maruel@ expects this to reach prod so that we can check how it's working.
The NextAction date has arrived: 2018-05-08
Status: Fixed (was: Assigned)
Pushed to prod, looks good!
