Issue 694913

Issue metadata

Status: Verified
Owner:
Closed: Feb 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug



Opening a very recently started logdog-powered build results in HTTP 500

Project Member Reported by vadimsh@chromium.org, Feb 22 2017

Issue description

https://screenshot.googleplex.com/hdeVr41L396.png

One way to reproduce: go to https://luci-scheduler-dev.appspot.com/jobs/infra/infra-continuous-trusty-64, wait until a build is about to start ("Next run: now"), then click the Status button.
 

Comment 1 by estaab@chromium.org, Feb 22 2017

Labels: -Pri-3 Pri-2

Comment 2 by no...@chromium.org, Feb 23 2017

Cc: d...@chromium.org
Labels: -Pri-2 luci Pri-1

Comment 3 by d...@chromium.org, Feb 23 2017

Yeah, Milo needs to propagate the 404.

Comment 4 by no...@chromium.org, Feb 23 2017

Milo should not return 404; the build exists. It is normal for the LogDog annotation stream to be unavailable for some time.

Comment 5 by d...@chromium.org, Feb 23 2017

"some time"? There is an ingest delay between the Butler emitting data and it getting loaded into LogDog. That delay is generally pretty short, but can be longer. If the stream doesn't exist in LogDog, there are two possibilities:
1) It will exist, but has not been ingested yet. Maybe retry?
2) It will never exist, so retrying is wasteful.

We could have Milo respond to 404s by retrying for some threshold (30s?) and then propagating the 404. Is there a better option here?

Comment 6 by no...@chromium.org, Feb 23 2017

Owner: d...@chromium.org
Status: Assigned (was: Untriaged)

Comment 7 by bugdroid1@chromium.org, Feb 24 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/external/github.com/luci/luci-go.git/+/887de917e6e7d48a151293fd298d22519c86b6af

commit 887de917e6e7d48a151293fd298d22519c86b6af
Author: dnj <dnj@chromium.org>
Date: Fri Feb 24 03:38:40 2017

Milo: Handle missing / transient LogDog failures.

If a build is labelled as having a LogDog stream, Milo will
unconditionally try and load that stream. This falls apart if the build
has either not started yet, or if the LogDog annotation stream is not
available due to natural ingest delay.

This will prevent stream loading if the build hasn't started yet, and
will retry LogDog transient and not-found errors prior to failing
absolutely.

BUG=chromium:692245, chromium:694913
TEST=None
R=hinoka@chromium.org, nodir@chromium.org

Review-Url: https://codereview.chromium.org/2717623002

[modify] https://crrev.com/887de917e6e7d48a151293fd298d22519c86b6af/logdog/common/types/streamaddr.go
[add] https://crrev.com/887de917e6e7d48a151293fd298d22519c86b6af/milo/appengine/frontend/expectations/bootstrap-swarming.TestableBuild-build-pending-logdog.html
[add] https://crrev.com/887de917e6e7d48a151293fd298d22519c86b6af/milo/appengine/frontend/expectations/bootstrap-swarming.TestableBuild-build-running-logdog-no-annotation-stream.html
[add] https://crrev.com/887de917e6e7d48a151293fd298d22519c86b6af/milo/appengine/frontend/expectations/buildbot-swarming.TestableBuild-build-pending-logdog.html
[add] https://crrev.com/887de917e6e7d48a151293fd298d22519c86b6af/milo/appengine/frontend/expectations/buildbot-swarming.TestableBuild-build-running-logdog-no-annotation-stream.html
[modify] https://crrev.com/887de917e6e7d48a151293fd298d22519c86b6af/milo/appengine/logdog/build.go
[modify] https://crrev.com/887de917e6e7d48a151293fd298d22519c86b6af/milo/appengine/logdog/http.go
[modify] https://crrev.com/887de917e6e7d48a151293fd298d22519c86b6af/milo/appengine/swarming/build.go
[add] https://crrev.com/887de917e6e7d48a151293fd298d22519c86b6af/milo/appengine/swarming/expectations/build-pending-logdog.json
[add] https://crrev.com/887de917e6e7d48a151293fd298d22519c86b6af/milo/appengine/swarming/expectations/build-running-logdog-no-annotation-stream.json
[modify] https://crrev.com/887de917e6e7d48a151293fd298d22519c86b6af/milo/appengine/swarming/html_data.go
[add] https://crrev.com/887de917e6e7d48a151293fd298d22519c86b6af/milo/appengine/swarming/testdata/build-pending-logdog.swarm
[add] https://crrev.com/887de917e6e7d48a151293fd298d22519c86b6af/milo/appengine/swarming/testdata/build-running-logdog-no-annotation-stream.swarm
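
As a rough illustration of the first half of the commit description (do not load the stream if the build has not started yet), a guard along these lines would be enough. This is a sketch under assumptions, not the code from the revision above: `BuildState`, `Build`, and `shouldLoadStream` are hypothetical names invented for the example.

```go
package main

import "fmt"

// BuildState is a hypothetical, simplified build lifecycle.
type BuildState int

const (
	Pending BuildState = iota // scheduled, but not started yet
	Running
	Completed
)

// Build is a hypothetical stand-in for the build record that Milo renders.
type Build struct {
	State           BuildState
	HasLogDogStream bool // the build is labelled as having a LogDog stream
}

// shouldLoadStream reports whether it makes sense to ask LogDog for the
// annotation stream: the build must be labelled with one and must have
// started, since a pending build cannot have emitted any data.
func shouldLoadStream(b Build) bool {
	return b.HasLogDogStream && b.State != Pending
}

func main() {
	fmt.Println(shouldLoadStream(Build{State: Pending, HasLogDogStream: true})) // false
	fmt.Println(shouldLoadStream(Build{State: Running, HasLogDogStream: true})) // true
}
```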

Comment 8 by no...@chromium.org, Feb 24 2017

Status: Verified (was: Assigned)

Comment 9 by d...@chromium.org, Feb 24 2017

Did we actually roll out a new Milo with this patch?

Comment 10 by no...@chromium.org, Feb 24 2017

I did, for both luci-milo and luci-milo-dev.

Comment 11 by d...@chromium.org, Feb 24 2017

A W E S O M E!
