Increase the builder's GoB quotas to better accommodate current load
Project Member Reported by diand...@chromium.org, Jan 10
Basically the same thing happened as was described in bug #769088. This paladin: https://luci-milo.appspot.com/buildbot/chromeos/fizz-paladin/3054 ...failed in CommitQueueSync. In the text of the error you see a lot of:
---
09:52:42: WARNING: git reported transient error (cmd=fetch -f https://chrome-internal-review.googlesource.com/chromeos/overlays/overlay-fizz-private refs/changes/39/540139/2); retrying
Traceback (most recent call last):
  File "/b/c/cbuild/repository/chromite/lib/retry_util.py", line 177, in _Wrapper
    ret = func(*args, **kwargs)
  File "/b/c/cbuild/repository/chromite/lib/retry_util.py", line 243, in _run
    return functor(*args, **kwargs)
  File "/b/c/cbuild/repository/chromite/lib/cros_build_lib.py", line 654, in RunCommand
    raise RunCommandError(msg, cmd_result)
RunCommandError: return code: 128; command: git fetch -f https://chrome-internal-review.googlesource.com/chromeos/overlays/overlay-fizz-private refs/changes/39/540139/2
fatal: remote error: Short term ls-remote-gerrit rate limit exceeded for <redacted>
fatal: The remote end hung up unexpectedly
[W git.go:283] Transient error string identified in STDERR: "fatal: The remote end hung up unexpectedly\n"
[W git.go:294] Retrying after 3s (rc=128): transient error string encountered
---
The previous bug was closed as WontFix since the problem didn't reproduce.
...actually, this happened on one other paladin too. I'll see if I can find any more as well... https://luci-milo.appspot.com/buildbot/chromeos/peach_pit-paladin/18175
Looking at the logs on the master paladin, it seems that the master first tried fizz-paladin/3054, observed the failure, and restarted/retried with fizz-paladin/3055. That second run passed. The theory from dgarrett@ is that since we added new GCE builders some months ago, our usage has increased enough that we're brushing up against our GoB quota limits. In this particular case "brushing up against" became "outright exceeded". Since the quota we hit was short term, a retry worked in that case. Assuming the theory is correct, the correct response will be to request more quota.
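For anyone wondering why a retry is enough for a short-term quota, here is a rough sketch of the pattern: treat a git failure as transient when stderr contains one of the strings from the log above, then retry with growing delays. This is not chromite's retry_util; the names `TransientError`, `is_transient`, `run_git` and `retry_transient` are made up for illustration, and the 3s starting delay is only borrowed from the "Retrying after 3s" log line.

```python
import subprocess
import time

# Substrings taken from the log excerpt in this bug; the list itself is
# an assumption, not the set of markers chromite or git.go actually uses.
TRANSIENT_MARKERS = (
    "rate limit exceeded",
    "The remote end hung up unexpectedly",
)


class TransientError(Exception):
    """Hypothetical exception for failures that are worth retrying."""


def is_transient(stderr):
    """Return True if stderr contains one of the known transient markers."""
    return any(marker in stderr for marker in TRANSIENT_MARKERS)


def backoff_delays(attempts, base=3.0, factor=2.0, max_delay=60.0):
    """Delays between tries: base, base*factor, ... capped at max_delay."""
    return [min(base * factor ** i, max_delay) for i in range(attempts)]


def run_git(args):
    """Run `git <args>`; raise TransientError for retryable failures."""
    cmd = ["git"] + list(args)
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return result.stdout
    if is_transient(result.stderr):
        raise TransientError(result.stderr)
    # Anything else is a real failure and should not be retried.
    raise subprocess.CalledProcessError(
        result.returncode, cmd, result.stdout, result.stderr)


def retry_transient(func, attempts=5, sleep=time.sleep):
    """Call func(), retrying on TransientError up to `attempts` tries."""
    delays = backoff_delays(attempts - 1)
    for i in range(attempts):
        try:
            return func()
        except TransientError:
            if i == attempts - 1:
                raise  # Out of tries; surface the last failure.
            sleep(delays[i])
```

Usage would look something like `retry_transient(lambda: run_git(["fetch", "-f", url, ref]))`. Note this only papers over short-term limits; if usage keeps growing, retries add load of their own, so more quota is still the real fix.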
Jan 16 (6 days ago),
This didn't get done last week. We need to find time to move it forward.
Today (9 hours ago),
-> don to annotate this bug with the metric that shows the problem, then rediscuss in next meeting
Today (8 hours ago),
This graph shows our usage over the last year. https://viceroy.corp.google.com/chromeos/gerrit?duration=38707218
Today (8 hours ago),