Issue 910095

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Blocked on:
issue 850113



Failed to fetch step information from LogDog

Project Member Reported by h...@chromium.org, Nov 29

Issue description

I recently loaded these three tryjob builds:
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux-chromeos-compile-dbg/7714
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux-chromeos-dbg/558
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_chromeos_msan_rel_ng/1067

First of all, they took forever to load (30 s or more), and then they just showed:

---
Failed to fetch step information from LogDog context deadline exceeded
- no logs -
---

Screenshot: https://screenshot.googleplex.com/bex798a1zTV.png

When reloading, it loads successfully.
 
Blockedon: 850113
Components: Infra>Platform>LogDog
Status: Available (was: Untriaged)
There are a couple of possibilities:

1. The logdog backend is short on instances (flex instances take longer to spin up).
2. The handler took a long time to process.

I think (1) is more likely than (2), so for now I'll increase the number of instances.

Either way, the class of issues where Milo is slow because of LogDog should be mitigated after issue 850113, when Milo starts loading builds from Buildbucket instead.
Comment 2 by bugdroid1@chromium.org, Dec 6

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-go.git/+/9d3688949727026d533af2086ceb91a558f3789f

commit 9d3688949727026d533af2086ceb91a558f3789f
Author: Ryan Tseng <hinoka@google.com>
Date: Thu Dec 06 18:07:46 2018

[logdog] Increase logs module min instances to 4 (from 2)

A bug described symptoms of not enough instances to serve log endpoints.
Because flex takes a longer time to spin up instances than classic,
it's more important to have idle instances ready to serve.

Bug: 910095
Change-Id: I2452fbfd73866321f63d10d57f74aaae9ce83fcb
Reviewed-on: https://chromium-review.googlesource.com/c/1355799
Commit-Queue: Ryan Tseng <hinoka@chromium.org>
Reviewed-by: Andrii Shyshkalov <tandrii@chromium.org>

[modify] https://crrev.com/9d3688949727026d533af2086ceb91a558f3789f/logdog/appengine/cmd/coordinator/logs/module-logs.yaml
