
Issue 642754

Starred by 1 user

Issue metadata

Status: WontFix
Owner: ----
Closed: Sep 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug




CQ not picking up updated cq.cfg from WebRTC repo?

Project Member Reported by kjellander@chromium.org, Aug 31 2016

Issue description

Hi, last week I renamed a couple of trybots and phajdan.jr@ was kind enough to restart the CQ after I noticed it wasn't picking up the updated config (https://chromium.googlesource.com/external/webrtc/+/master/infra/config/cq.cfg). That time things seem to have resolved themselves after a while; I didn't pay attention to exactly when, since I didn't want to load our try server too much.

Today I renamed a few more bots in https://codereview.chromium.org/2289423002/ followed by master restarts and a CQ config CL: https://codereview.webrtc.org/2293423002/
I asked Pawel to restart the CQ again and he did.

But even after that, CLs like https://codereview.webrtc.org/2283743003/ still try to trigger the no-longer-existing trybots ios64_gn_rel and ios64_gn_dbg.

Is this a CQ bug? 
 
It could be due to the use of a stale luci cache on the CQ host.
Cc: dsansome@chromium.org
OK, so the CQ was restarted at ~12:38 PM MUC time, and I can see it in the log. The config change landed at ~12:33 PM MUC time. Luci-config polls roughly every 10 minutes.
The CQ log says, right after the restart (note MTV time):

[I2016-08-31T03:38:12.553630-07:00 18306 140671463126848 projects:249] Config revision: cf157cf655ec96c4bc6e8105301ae032d23a6543

which corresponds to https://chromium.googlesource.com/external/webrtc/+/cf157cf655ec96c4bc6e8105301ae032d23a6543 dating to Mon Aug 29.
So, Pawel restarted the CQ too early: about 5 minutes after the change landed, which is still inside the ~10 minute polling window.
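
The timing argument above can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in this comment (a polling window of up to about 10 minutes); the function name and constant are hypothetical and not part of the CQ or luci-config code.

```python
from datetime import datetime, timedelta

# Assumption taken from this thread: luci-config may need up to ~10 minutes
# to notice a newly landed commit.
POLL_WINDOW = timedelta(minutes=10)


def restart_may_miss_config(landed, restarted, poll_window=POLL_WINDOW):
    """Return True if the restart happened before the new config is
    guaranteed to have been polled (hypothetical helper)."""
    return restarted < landed + poll_window


# Times from this issue (MUC time, Aug 31 2016).
landed = datetime(2016, 8, 31, 12, 33)
restarted = datetime(2016, 8, 31, 12, 38)

print(restart_may_miss_config(landed, restarted))  # True: restarted too early
```

A restart at 12:43 or later would have been outside the polling window under the same assumption.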

Finally, after Monday's outage, Dave disabled auto_deploy and apparently it wasn't turned back on.

In any case, I'm manually restarting webrtc CQ now.

So, I'm restarting the CQ right now for you manually.
After the restart, this is in the log:

[I2016-08-31T08:27:38.680762-07:00 12242 140262567593792 projects:249] Config revision: 20e47a2a94ce519ed892f8e5ca8021437c16c4ed
Thanks for fixing and clarifying how it works!

What should be the normal procedure for when we've updated cq.cfg? Should it normally be picked up automatically within 10 minutes?

Or should we document instructions about waiting 10 minutes and then finding an infra person who can issue a restart?

Comment 5 by estaab@chromium.org, Aug 31 2016

Cc: estaab@chromium.org
Re #4: under normal circumstances (read: most of the time during the last 12 months):
* luci-config takes at most 10 minutes to process a new commit.
* CQ is restarted every 30 minutes.
=> at most 40 minutes

Dave has been working on actually restarting the CQ immediately if a new config is available (and only then). This will take some time, and we'll surely send a happy PSA at the time. Until then, 40 minutes is your best guide. [1]


[1] Well, given that the CQ is restarted by cron on */30, if you land your change by, say, 12:45, it's reasonable to expect it live by 13:00.
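
For anyone who wants to estimate this for their own change, here is a rough Python sketch of the arithmetic above. It assumes exactly what this comment states (up to 10 minutes for luci-config, then the next */30 cron restart) and nothing about the real CQ internals; `expected_live_by` is a hypothetical name.

```python
from datetime import datetime, timedelta


def expected_live_by(landed, poll_max=timedelta(minutes=10), cron_minutes=30):
    """Estimate the latest time a cq.cfg change goes live (hypothetical helper).

    Assumes luci-config needs at most `poll_max` to see the commit and that
    the CQ is restarted by cron every `cron_minutes` (e.g. at :00 and :30).
    """
    visible = landed + poll_max
    # Round down to the most recent cron boundary...
    boundary = visible.replace(
        minute=(visible.minute // cron_minutes) * cron_minutes,
        second=0,
        microsecond=0,
    )
    # ...and move to the next one if the config became visible after it.
    if boundary < visible:
        boundary += timedelta(minutes=cron_minutes)
    return boundary


print(expected_live_by(datetime(2016, 8, 31, 12, 45)))  # 2016-08-31 13:00:00
print(expected_live_by(datetime(2016, 8, 31, 12, 51)))  # 2016-08-31 13:30:00
```

The second call shows the near-worst case: a change landing just after 12:50 can miss the 13:00 restart and wait until 13:30, which is where the "at most 40 minutes" figure comes from.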
Cc: machenb...@chromium.org
Status: WontFix (was: Untriaged)
Thanks for the additional details, this really helps customers understand how it works.
I added some details to our own CQ wiki page for the future, but I'm wondering if important facts like these are covered somewhere at go/chrome-infra-docs? 
I could only find https://chrome-internal.googlesource.com/infra/infra_internal/+/master/doc/troopers/playbook.md#Commit-Queue-issues but it's not really the right spot to put it.

https://chrome-internal.googlesource.com/infra/infra_internal/+/master/doc/systems/index.md in the internal docs doesn't list the CQ, and finally the https://cr-doc.appspot.com/ search page doesn't find anything...
