
Issue 687236


Issue metadata

Status: Fixed
Closed: Jan 2017
Pri: 1
Type: Bug


datastore_v3: TIMEOUT when retrieving many builds

Project Member Reported by eakuefner@chromium.org, Jan 31 2017

Issue description

For example, loading https://luci-milo.appspot.com/buildbot/chromium.perf/Win%207%20x64%20Perf/?limit=100 consistently fails with:

Error: 500

API error 5 (datastore_v3: TIMEOUT): The datastore operation timed out, or the data was temporarily unavailable.
Request ID: 5890cd5e00ff0d3ffb7e1e60b80001737e6c7563692d6d696c6f0001313430372d31386161383434000100
 

Comment 1 by estaab@chromium.org, Jan 31 2017

Cc: d...@chromium.org hinoka@chromium.org
Labels: -Pri-2 Pri-1
Owner: hinoka@chromium.org
Status: Assigned (was: Untriaged)
Ryan and Dan, can you see if you can fix this? Maybe we're fetching too much data at once?

Comment 3 by estaab@chromium.org, Jan 31 2017

Ok. Do we have to unmarshal everything? 10 seconds seems long for rendering this page.

Comment 4 by hinoka@chromium.org, Jan 31 2017

Because of the way the build struct was designed, it has to be unmarshalled on load :(. In this case that means load -> decompress -> base64 decode -> JSON unmarshal for each of the 100 items.

I do want to refactor it to do something more sensible (split summary and detail) but it will be a little tricky/risky.  I think this will be a blocker for console view so it'll probably also happen sooner rather than later.
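
To make the per-item cost concrete, here is a minimal Go sketch of the load path described in this comment (decompress -> base64 decode -> JSON unmarshal), repeated for every item. It is an illustration under assumptions, not Milo's actual code: buildSummary, encodeBuild, decodeBuild and decodeAll are hypothetical names, gzip is assumed for the compression step (the thread does not name the codec), and the layering simply follows the order given above.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/base64"
	"encoding/json"
	"io"
)

// buildSummary is a hypothetical stand-in for the stored build struct.
type buildSummary struct {
	Number int    `json:"number"`
	Result string `json:"result"`
}

// encodeBuild produces sample data in the assumed stored form:
// JSON marshal -> base64 encode -> gzip compress.
func encodeBuild(b buildSummary) ([]byte, error) {
	raw, err := json.Marshal(b)
	if err != nil {
		return nil, err
	}
	encoded := base64.StdEncoding.EncodeToString(raw)
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write([]byte(encoded)); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeBuild reverses encodeBuild:
// decompress -> base64 decode -> JSON unmarshal.
func decodeBuild(blob []byte) (buildSummary, error) {
	var b buildSummary
	zr, err := gzip.NewReader(bytes.NewReader(blob))
	if err != nil {
		return b, err
	}
	defer zr.Close()
	encoded, err := io.ReadAll(zr)
	if err != nil {
		return b, err
	}
	raw, err := base64.StdEncoding.DecodeString(string(encoded))
	if err != nil {
		return b, err
	}
	err = json.Unmarshal(raw, &b)
	return b, err
}

// decodeAll runs the full pipeline once per stored blob, so the CPU
// cost grows linearly with the requested limit.
func decodeAll(blobs [][]byte) ([]buildSummary, error) {
	out := make([]buildSummary, 0, len(blobs))
	for _, blob := range blobs {
		b, err := decodeBuild(blob)
		if err != nil {
			return nil, err
		}
		out = append(out, b)
	}
	return out, nil
}
```

Calling decodeAll on 100 blobs performs 100 decompressions, 100 base64 decodes and 100 JSON unmarshals before anything can be rendered, which is the work a summary/detail split would avoid for list views.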

Comment 5 by d...@chromium.org, Jan 31 2017

Owner: d...@chromium.org
Status: Started (was: Assigned)

Comment 6 by d...@chromium.org, Jan 31 2017

I propose using a Batcher: https://codereview.chromium.org/2668763002

Comment 7 by bugdroid1@chromium.org, Jan 31 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/external/github.com/luci/luci-go.git/+/b7c33500af911388323e6e612e933ea0f7f57888

commit b7c33500af911388323e6e612e933ea0f7f57888
Author: dnj <dnj@chromium.org>
Date: Tue Jan 31 19:27:53 2017

Use a datastore batcher for build queries.

Queries are timing out. This is because processing elements is
CPU-intensive, and datastore queries have a maximum lifetime of 30
seconds. Use batching to break the single query/deserialize process into
a series of consecutive queries so any individual query doesn't run into
the timeout limit.

Note that really high limits will still bump into the actual AppEngine
request timeout.

BUG= chromium:687236 
TEST=None
R=estaab@chromium.org, hinoka@chromium.org

Review-Url: https://codereview.chromium.org/2668763002

[modify] https://crrev.com/b7c33500af911388323e6e612e933ea0f7f57888/milo/appengine/buildbot/builder.go
[modify] https://crrev.com/b7c33500af911388323e6e612e933ea0f7f57888/milo/appengine/buildbot/console.go
[add] https://crrev.com/b7c33500af911388323e6e612e933ea0f7f57888/milo/appengine/buildbot/datastore.go
[modify] https://crrev.com/b7c33500af911388323e6e612e933ea0f7f57888/milo/appengine/buildbot/grpc.go
[modify] https://crrev.com/b7c33500af911388323e6e612e933ea0f7f57888/milo/appengine/buildbot/master.go
[modify] https://crrev.com/b7c33500af911388323e6e612e933ea0f7f57888/milo/appengine/buildbot/pubsub.go
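
The batching idea in the commit message above can be sketched roughly as follows. This is not the luci-go Batcher API: page, fetchPage and runBatched are hypothetical names, and the fetch callback stands in for one short datastore query that resumes from a cursor. The point is only that each query stays short because per-item processing happens between queries rather than inside one long-lived query.

```go
package main

import "errors"

// page is one batch of results plus an opaque cursor for the next batch.
type page struct {
	Items      []string
	NextCursor string // empty when there are no more results
}

// fetchPage is a hypothetical stand-in for a single short query that
// resumes from cursor and returns at most batchSize items.
type fetchPage func(cursor string, batchSize int) (page, error)

// runBatched processes up to limit items by issuing consecutive short
// queries instead of one long one. It returns the number of items
// processed. It trusts fetch to honor the requested batch size.
func runBatched(fetch fetchPage, batchSize, limit int, process func(string) error) (int, error) {
	if batchSize <= 0 {
		return 0, errors.New("batchSize must be positive")
	}
	cursor, done := "", 0
	for done < limit {
		// Never ask for more than is still needed to reach limit.
		n := batchSize
		if remaining := limit - done; remaining < n {
			n = remaining
		}
		p, err := fetch(cursor, n)
		if err != nil {
			return done, err
		}
		// The CPU-heavy work runs here, after the query has returned.
		for _, item := range p.Items {
			if err := process(item); err != nil {
				return done, err
			}
			done++
		}
		if p.NextCursor == "" || len(p.Items) == 0 {
			break
		}
		cursor = p.NextCursor
	}
	return done, nil
}
```

As the commit message notes, this only bounds the lifetime of each individual query. Total work is unchanged, so a large enough limit can still exceed the overall request timeout.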

Comment 8 by hinoka@chromium.org, Jan 31 2017

Status: Fixed (was: Started)
This works now but it's hella slow.  Marking as fixed for now and will open another bug for the speed issue.

Comment 9 by d...@chromium.org, Jan 31 2017

Cool! Yeah nothing I did speeds anything up at all :P
Nice! Thanks Ryan for the quick debugging and Dan for the fix!
