
Issue 920852

Starred by 1 user

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug




Log stream has no annotation entries

Project Member Reported by jdarpinian@chromium.org, Jan 11

Issue description

Labels: -Pri-3 Pri-1
I have good news and bad news

Bad news: We probably lost the annotation stream.
Good news: We have the build information in buildbucket.
You can access the build via the not-yet-completed buildbucket page:
https://ci.chromium.org/p/chromium/builders/try/linux_angle_rel_ng/4689

It looks like this isn't as widespread as "We lost everything": I can still load a lot of the builds, but many others aren't loadable. This graph confirms we lost builds, but not a catastrophic number of them:
https://screenshot.googleplex.com/WGaBJpSPEoq

Obviously we don't want to lose any builds, but plugging the leaks in this pipeline is an ongoing project.
Thanks. Those logs were accessible sometime in the past, because they were linked from old bugs. So they were lost recently?

I mostly care about the build logs, so that buildbucket page is fine for me. Looks like you just deleted "luci.chromium" from my first URL, but that doesn't work for the second one. How can I fix up my second URL?
Unfortunately the second one pre-dated buildbucket storing full information about a build.

Comment 4 by hinoka@chromium.org, Jan 18 (4 days ago)

Owner: hinoka@chromium.org
Status: Assigned (was: Untriaged)
Root cause found: it was due to a Tumble backlog.
Tumble started to build up a backlog in early December because GAE ran out of quota (it used up $5000/day) and the Tumble shard count was reduced to mitigate that.
This dropped the Tumble processing rate below the required threshold.
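
To illustrate the mechanism, here is a minimal sketch of a fixed-rate queue. It is not Tumble's actual code, and the function name, the per-shard rate, and all numbers are made up for illustration. The only point it shows is that once the shard count puts total processing capacity below the arrival rate, the backlog grows without bound instead of draining.

```python
def backlog_over_time(arrival_rate, per_shard_rate, shards, steps, initial=0.0):
    """Return the queue size after each time step for a simple fixed-rate queue.

    Hypothetical model: every step, `arrival_rate` items arrive and each of
    the `shards` workers processes up to `per_shard_rate` items.
    """
    capacity = per_shard_rate * shards
    backlog = initial
    history = []
    for _ in range(steps):
        # The queue can never go negative: spare capacity is simply unused.
        backlog = max(0.0, backlog + arrival_rate - capacity)
        history.append(backlog)
    return history


# Made-up numbers: 100 items arrive per step, each shard handles 10 per step.
enough = backlog_over_time(arrival_rate=100, per_shard_rate=10, shards=12, steps=5)
too_few = backlog_over_time(arrival_rate=100, per_shard_rate=10, shards=8, steps=5)

print(enough)   # capacity 120 >= 100, so the queue stays empty
print(too_few)  # capacity 80 < 100, so the queue grows by 20 every step
```

In this toy model the threshold is the shard count at which capacity equals the arrival rate (10 shards here). Anything below it produces a steadily growing queue.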

Tumble queue size: https://screenshot.googleplex.com/DqiOLvYCZRQ
Tumble processing rate: https://screenshot.googleplex.com/jxPRUNNmKzh

Tumble shard count was increased, which caused LogDog to run out of quota again last night.  This isn't really a sustainable strategy.

This class of failure should go away with the work tracked in issue 923557.

Comment 5 by jdarpinian@chromium.org, Jan 18 (4 days ago)

Thanks for investigating!
