CQ not picking up updated cq.cfg from WebRTC repo?
Issue description

Hi, last week I renamed a couple of trybots, and phajdan.jr@ was kind enough to restart the CQ after I noticed it wasn't picking up the updated config (https://chromium.googlesource.com/external/webrtc/+/master/infra/config/cq.cfg). That time things seem to have resolved themselves after a while. I didn't pay attention to exactly when, since I didn't want to load our try server too much.

Today I renamed a few more bots in https://codereview.chromium.org/2289423002/ followed by master restarts and a CQ config CL (https://codereview.webrtc.org/2293423002/). I asked Pawel to restart the CQ again and he did. But even after that, CLs like https://codereview.webrtc.org/2283743003/ still try to trigger the no-longer-existing trybots ios64_gn_rel and ios64_gn_dbg.

Is this a CQ bug?
Aug 31 2016
OK, so the CQ was restarted at ~12:38 PM MUC time, and I can see that in the log. The config change landed at ~12:33 PM MUC time, and luci-config polls roughly every 10 minutes. Right after the restart, the CQ log says (note MTV time):

[I2016-08-31T03:38:12.553630-07:00 18306 140671463126848 projects:249] Config revision: cf157cf655ec96c4bc6e8105301ae032d23a6543

which corresponds to https://chromium.googlesource.com/external/webrtc/+/cf157cf655ec96c4bc6e8105301ae032d23a6543 dating to Mon Aug 29. So Pawel restarted the CQ too early: only about 5 minutes after the change landed, before luci-config had picked it up.

Finally, after Monday's outage, Dave disabled auto_deploy and apparently it wasn't turned back on. In any case, I'm manually restarting the WebRTC CQ now.
Aug 31 2016
After the restart, this is in the log:

[I2016-08-31T08:27:38.680762-07:00 12242 140262567593792 projects:249] Config revision: 20e47a2a94ce519ed892f8e5ca8021437c16c4ed
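For anyone who wants to repeat this check, here is a minimal sketch of pulling the config revision out of such a log line. The line format is inferred only from the two lines quoted in this thread, and the function name `parse_config_revision` is made up for illustration; it is not part of the CQ itself.

```python
import re
from datetime import datetime, timezone

# Pattern inferred from the two log lines quoted in this thread (an assumption,
# not a documented CQ log format): "[I<ISO timestamp> <pid> <thread> <module>:<line>] Config revision: <sha1>"
LOG_RE = re.compile(
    r"\[I(?P<ts>\S+) .*?\] Config revision: (?P<rev>[0-9a-f]{40})"
)


def parse_config_revision(log_line):
    """Return (UTC timestamp, revision) from a log line, or None if it does not match."""
    match = LOG_RE.search(log_line)
    if match is None:
        return None
    # The timestamp carries its own UTC offset (-07:00 for MTV), so normalise to UTC.
    stamp = datetime.fromisoformat(match.group("ts")).astimezone(timezone.utc)
    return stamp, match.group("rev")


if __name__ == "__main__":
    line = (
        "[I2016-08-31T03:38:12.553630-07:00 18306 140671463126848 projects:249] "
        "Config revision: cf157cf655ec96c4bc6e8105301ae032d23a6543"
    )
    stamp, rev = parse_config_revision(line)
    print(stamp.isoformat(), rev)
```

Comparing the extracted revision with the commit that changed cq.cfg shows whether the running CQ has the new config yet.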
Aug 31 2016
Thanks for fixing this and clarifying how it works! What should the normal procedure be after we've updated cq.cfg? Should it normally be picked up automatically within 10 minutes? Or should we document instructions about waiting 10 minutes and then finding an infra person who can issue a restart?
Aug 31 2016
Sep 1 2016
Re #4: under normal circumstances (read: most of the time during the last 12 months):
* luci-config takes at most 10 minutes to process a new commit.
* CQ is restarted every 30 minutes.
=> at most 40 minutes in total.

Dave has been working on restarting the CQ immediately when a new config is available (and only then). This will take some time, and we'll surely send a happy PSA when it's ready. Until then, 40 minutes is your best guide. [1]

[1] Well, given that the CQ is restarted by cron on */30, if you land your change by, say, 12:45, it's reasonable to expect it live by 13:00.
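The arithmetic above can be sketched as a small helper. This is only an illustration of the two numbers given in this comment (up to 10 minutes for luci-config, then the next */30 cron restart); the function name `expected_live_by` is hypothetical, and the sketch conservatively assumes that a restart landing exactly on the 10-minute mark may still miss the new config.

```python
from datetime import datetime, timedelta

# Both values come from the comment above; everything else is an assumption.
LUCI_CONFIG_MAX_DELAY = timedelta(minutes=10)
CRON_PERIOD_MINUTES = 30  # CQ restarted by cron on */30


def expected_live_by(landed):
    """Latest time by which a cq.cfg change landed at `landed` should be live."""
    visible = landed + LUCI_CONFIG_MAX_DELAY
    # Round down to the previous cron boundary (minute 0 or 30) ...
    boundary = visible.replace(
        minute=visible.minute - visible.minute % CRON_PERIOD_MINUTES,
        second=0,
        microsecond=0,
    )
    # ... then take the following restart, which is guaranteed to see the config.
    return boundary + timedelta(minutes=CRON_PERIOD_MINUTES)


if __name__ == "__main__":
    landed = datetime(2016, 8, 31, 12, 45)
    print(expected_live_by(landed))  # 2016-08-31 13:00:00
```

With this rounding, the worst case is a change landing at :20 or :50, which waits the full 40 minutes.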
Sep 2 2016
Thanks for the additional details; this really helps customers understand how it works. I added some details to our own CQ wiki page for the future, but I'm wondering whether important facts like these are covered somewhere at go/chrome-infra-docs? I could only find https://chrome-internal.googlesource.com/infra/infra_internal/+/master/doc/troopers/playbook.md#Commit-Queue-issues but it's not really the right spot for it. The internal docs index at https://chrome-internal.googlesource.com/infra/infra_internal/+/master/doc/systems/index.md doesn't list CQ, and finally the https://cr-doc.appspot.com/ search page doesn't find anything...
Comment 1 by tandrii@chromium.org, Aug 31 2016