New issue
Advanced search Search tips

Issue 703428 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Jun 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: ----



Sign in to add a comment

LogDog Viewer intra-prefix muxing should infer log contiguousness from contiguous prefix indices.

Project Member Reported by dgarr...@chromium.org, Mar 21 2017

Issue description

After a Golo network issues interrupted builds today (crbug.com/702658), we found a number of builders like this one:

https://uberchromegw.corp.google.com/i/chromeos/builders/daisy_spring-paladin/builds/14720/

Whose links to logs under the topmost 'steps' couldn't be loaded:

https://luci-logdog.appspot.com/v/?s=chromeos%2Fbb%2Fchromeos%2Fdaisy_spring-paladin%2F14720%2F%2B%2F%2A%2A%2Fstdout&s=chromeos%2Fbb%2Fchromeos%2Fdaisy_spring-paladin%2F14720%2F%2B%2F%2A%2A%2Fstderr


It appears that the build interruption generated invalid state inside logdog, which means the information which was uploaded can't be found. The problem is that we need those logs to understand what happened to the builders.
 

Comment 1 by d...@chromium.org, Mar 21 2017

Labels: Pri-1
Summary: LogDog Viewer intra-prefix muxing should infer log contiguousness from contiguous prefix indices. (was: After network issues interrupted builds, logdog contents unviewable.)
OK, so now that I understand this better.

The LogDog viewer, when loading multiple streams, tries to mux them together. In order to mux streams, it has to know what records are available in those streams so that it knows how to render them in order.

The problem here is that the "stderr" stream is still in "streaming" state, but its last emitted record was BEFORE the first record of the next stream. The viewer is in a position: 'do I render a log line from another stream, or do I render the next log line from "stderr"?' It can't actually reconcile this without fetching the next log line from "stderr". "stderr" is not terminated, so there may be another, and "stderr" is also never going to terminate b/c the build crashed, so that other line will never arrive.

There are two types of muxing supported by the viewer: intra-prefix and multi-prefix. Multi-prefix says, "Render these logs, and they are from different Butler instances". In this case, muxing happens by timestamp, and so the next log line for each stream must be known to mux.

However, in "intra-prefix", all log streams come from the same Butler instance, so log lines have a contiguous "prefix index" space. Muxing *should* focus only on that, and can, therefore, make an inference: "If the last prefix index that I rendered was N, and I have an available log with index N+1, that is the next one." It can make this inference even if "stderr" has no more data available.

The solution here is to build that inference into the "infra-prefix" muxing logic of the viewer. I'll try and fix this tomorrow.

Comment 2 by d...@chromium.org, Mar 21 2017

OK, I put a lot of thought into this.

I can implement a partial solution, which is to add a best effort prefix index walk, but this will probably not get too far since the annotation protobuf datagrams eat up some of those prefix indexes, and those will never be part of the glob.

In the situation that *every* text log is part of the glob expression, then we could add an absolute text log ordering field and walk along that, but if any text log stream is missing from the glob this ordering will fail.

Another option that comes to mind is to add a "stop streaming" button that will stop trying to load logs that are incomplete and proceed as if everything was complete. This may be a good idea, but it also might be confusing to users.

I think what we have to do here is resign ourselves to the fact that if log streams are broken, globbing against them is likely to get stuck like this. For incomplete log streams, loading individual logs is the only functional option.
Project Member

Comment 3 by bugdroid1@chromium.org, May 3 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/external/github.com/luci/luci-go.git/+/2944f11ecd9ab8b74ccdde28c5a6bd01a488aa85

commit 2944f11ecd9ab8b74ccdde28c5a6bd01a488aa85
Author: dnj <dnj@chromium.org>
Date: Wed May 03 19:16:07 2017

Enable sequential prefix index logs to be loaded.

The LogDog viewer requires at least one log from each muxing stream to
know which one is sequentially next. For timestamp-based comparison, this is
a hard requirement, since the next log in any given stream could be
sequentially next. However, for prefix index-based comparison, the
viewer could know that a log from another buffer is next if its prefix
index is contiguous.

Enable this inference in the viewer: if a log stream buffer holds a
prefix index that is one more than the last buffered log, allow it to be
considered as the next log, even if other buffers are empty.

BUG= chromium:703428 
TEST=prod

Review-Url: https://codereview.chromium.org/2861583003

[modify] https://crrev.com/2944f11ecd9ab8b74ccdde28c5a6bd01a488aa85/web/inc/logdog-stream-view/viewer.ts

Comment 4 by estaab@chromium.org, Jun 21 2017

Status: Assigned (was: Untriaged)

Comment 5 by d...@chromium.org, Jun 21 2017

Status: Fixed (was: Assigned)
This is actually fixed, so yay! It likely resulted in a slightly less annoyed human on occasion, unbeknownst to said human.

Sign in to add a comment