Trybots sometimes not updating on Gerrit issue
Issue description

I see this once in a while; this time I captured it and logged a bunch of debugging data.

CL where this was observed: https://chromium-review.googlesource.com/c/452484

Screenshot showing the problem (the trybots are all from patchset 1, but are shown for patchset 2, which had no tryjobs at the time): https://screenshot.googleplex.com/cto8P9XvBJD.png

Screenshot after duplicating the tab (looked good immediately): https://screenshot.googleplex.com/EAQCo3R9nBU.png

The attached screenshots show the network activity on the first tab with the problem. All times in CET.
- First screenshot shows there is no buildbucket polling going on.
- Second screenshot (3 minutes later) shows buildbucket polling started again out of the blue, but returns 401s.

The 401 response was a buildbucket error claiming that an authorization token was required. The request header did specify the token. I was staring at it, but unfortunately the data got lost before I was able to copy it.
Mar 13 2017
Ran into it again. Same CL. Just like in Andrii's theory, I uploaded a third patchset, was impatient, clicked on the issue link in the top left corner to update the tab, manually selected the first patchset again (the one that should have tryjobs), and saw no tryjobs. Again there was no polling, and after a bit I saw 401s from buildbucket.

Here are the headers (bearer token removed) and the response: https://paste.googleplex.com/6207088699113472
Mar 14 2017
buildbucket logs: https://screenshot.googleplex.com/wFE4Azxk5E3

This looks like an issue in the Gerrit-Buildbucket integration.
Mar 22 2017
Aaron, once you are back, can you decide what to do with this?
Mar 23 2017
I have no idea what to do with this. I can't reproduce it myself, and I've tried. And I have no earthly clue why refreshing the page at the wrong time would cause 401s. I'm well and truly lost. Andy, you did all the buildbucket plugin auth stuff, can you take a look?
Mar 24 2017
I have a reliable repro of at least one scenario:
- Take any CL that has trybots on the latest patchset and navigate to it
  -> You can verify in the developer console's network tab that buildbucket is polled regularly and returns 200s.
- Wait one hour (my guess: until the auth token expires)
  -> buildbucket starts returning 401s and polling gets more and more delayed
- Optionally upload a new patchset (that contains changes) and trigger some tryjobs on the command line, e.g. "git cl try -m tryserver.v8 -b boom", to distinguish them from the existing tryjobs
- On the CL, navigate by clicking on the CL number in the top left corner (don't press refresh)
  -> Now the latest patchset can be seen, but still with the old tryjobs
  -> Buildbucket polling still returns 401s.
  -> If you refresh, everything gets back to normal

I think that instead of clicking on the CL number, there are several other ways to navigate and see this bug, e.g. clicking on My->Changes and selecting a different change, but that's a guess...
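One way to make polling recover from the 401s in this repro without a page refresh is to treat a 401 as a signal that the cached token is stale, fetch a new one, and retry once. The sketch below is only an illustration of that idea in Python, not the plugin's actual JavaScript; `poll_with_reauth`, `fetch_token`, and `call_buildbucket` are hypothetical names introduced here.

```python
from typing import Callable, Tuple


def poll_with_reauth(
    fetch_token: Callable[[], str],
    call_buildbucket: Callable[[str], Tuple[int, str]],
    cached_token: str,
) -> Tuple[int, str, str]:
    """Poll once; on a 401, fetch a fresh token and retry a single time.

    All three parameters are hypothetical stand-ins:
      fetch_token      -- asks the Gerrit backend for a new auth token
      call_buildbucket -- performs one poll, returns (http_status, body)
      cached_token     -- the token the page is currently holding

    Returns (status, body, token_to_cache) so the caller can keep the
    fresh token for subsequent polls.
    """
    status, body = call_buildbucket(cached_token)
    if status != 401:
        # Token still accepted (or some unrelated error): keep it.
        return status, body, cached_token

    # 401: assume the cached token expired, replace it and retry once.
    fresh_token = fetch_token()
    status, body = call_buildbucket(fresh_token)
    return status, body, fresh_token
```

Retrying only once keeps a genuinely unauthorized user from causing a request loop; a second 401 is simply returned to the caller.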
Mar 24 2017
bug: the if condition in https://chromium.googlesource.com/infra/gerrit-plugins/buildbucket/+/9e52ba5/src/main/resources/static/cr-buildbucket-view.js#322 does not check whether the token has expired
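As a rough illustration of the missing check: a "do I need a new token?" condition would look at the expiry time as well as at whether a token exists. The sketch below is in Python rather than the plugin's JavaScript and does not reproduce the actual condition at the linked line; `CachedToken`, `needs_refresh`, and the 60-second margin are assumptions made for the example.

```python
import time
from dataclasses import dataclass
from typing import Optional

# Hypothetical safety margin: refresh slightly before the real expiry so a
# token does not expire while a request is in flight.
EXPIRY_MARGIN_SECONDS = 60.0


@dataclass
class CachedToken:
    """Hypothetical shape of a cached auth token."""
    value: str
    expires_at: float  # seconds since the epoch


def needs_refresh(token: Optional[CachedToken], now: Optional[float] = None) -> bool:
    """Return True if there is no token or it is expired (or about to be)."""
    if now is None:
        now = time.time()
    if token is None:
        return True  # nothing cached yet
    # The check reported as missing: a cached token can still be stale.
    return now >= token.expires_at - EXPIRY_MARGIN_SECONDS
```

Checking only for the presence of a token would match the behavior described in this thread: polling works for about an hour, then every call returns 401 until the page is reloaded.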
Mar 29 2017
Comment 1 by tandrii@chromium.org, Mar 13 2017

So, after more discussion with Michael, we think there are two bugs:

### First, the most important one
If tryjobs for a given patchset have not been fetched yet, the UI should say so and not show potentially stale results. The buildbucket polling seems to be ongoing but with weird delays between calls (no call for 3 minutes, then 2 calls within 1 minute), but for whatever reason results from the prior patchset are still shown. Also note that the timestamp listed on the buildbucket results says 1:41 PM, while the timestamp for the change ("Updated" in the top left corner) shows 1:46 PM (which is when patchset #2 was uploaded).

Now, I am very certain I have witnessed this before, in conditions similar to Michael's case here:
0. send patchset #1 for review.
1. get email from reviewer, open reviewer's comment on patchset #1
2. fix it in my terminal
3. run git cl upload, which takes some time.
4. expect patchset #2 to be uploaded very soon, go to browser and navigate to the change's main view (i.e. like https://chromium-review.googlesource.com/c/452484).

As 3 is still running while I'm doing 4, there are potential races, even more likely with behind-the-scenes data stores replicating git data asynchronously.

### Second
Buildbucket replied with 401 and complained about auth. Maybe the auth token was close to expiry when first fetched from the Gerrit backend, and on later calls buildbucket couldn't accept it? Or maybe a rare buildbucket bug?

+nodir@ we have the timestamp of the bad calls (~1:55 PM UTC+1) -> can you check the buildbucket logs and provide any more insight into what was actually wrong with the request auth?
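For the first bug above (stale results shown for a patchset whose tryjobs have not been fetched), one possible approach is to store results keyed by patchset number and render an explicit "not fetched yet" state when the selected patchset has no entry. This is a minimal Python sketch of that idea, not how the plugin is actually structured; `TryjobView` and the builder names in the usage note are hypothetical.

```python
from typing import Dict, List, Optional


class TryjobView:
    """Hypothetical view model: tryjob results are stored per patchset."""

    def __init__(self) -> None:
        self._results: Dict[int, List[str]] = {}
        self.selected_patchset: Optional[int] = None

    def select_patchset(self, patchset: int) -> None:
        # Selecting a patchset never copies results from another one.
        self.selected_patchset = patchset

    def on_fetch_complete(self, patchset: int, tryjobs: List[str]) -> None:
        # Record results under the patchset they were fetched for.
        self._results[patchset] = tryjobs

    def render(self) -> str:
        ps = self.selected_patchset
        if ps is None:
            return "no patchset selected"
        if ps not in self._results:
            # Distinguish "not fetched" from "fetched, but empty".
            return f"patchset {ps}: tryjobs not fetched yet"
        jobs = self._results[ps]
        if not jobs:
            return f"patchset {ps}: no tryjobs"
        return f"patchset {ps}: " + ", ".join(jobs)
```

With results for patchset 1 stored as `["linux_rel", "mac_rel"]`, selecting patchset 2 before its fetch completes renders "patchset 2: tryjobs not fetched yet" instead of reusing patchset 1's list.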