New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 654885 link

Starred by 0 users

Issue metadata

Status: Fixed
Owner:
Last visit > 30 days ago
Closed: Apr 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 2
Type: Bug



Sign in to add a comment

"infra go steps" has flakily closed the infra tree 27 times in the last 40 days

Project Member Reported by aga...@chromium.org, Oct 11 2016

Issue description

http://infra-status.appspot.com/?limit=80

"infra go steps" has been the cause of tree closure 27 times between September 1st and today (October 11th). The failure is almost always a single test in infra/monitoring/proxy.

"""
--------------------------------------------------------------------------------
.................................................F..........

--------------------------------------------------------------------------------
PACKAGE: infra/monitoring/proxy
--------------------------------------------------------------------------------
.....
5 total assertions

......x
Failures:

  * /mnt/data/b/rr/tmpFMxIEP/w/infra/go/src/infra/monitoring/proxy/logging_test.go 
  Line 212:
  Expected: '50'
  Actual:   '8'
  (Should be equal)


12 total assertions

--- FAIL: TestCloudLogging (0.00s)
S
12 total assertions (one or more sections skipped)

......................
34 total assertions (one or more sections skipped)

.............
47 total assertions (one or more sections skipped)

FAIL
coverage: 42.3% of statements
FAIL	infra/monitoring/proxy	0.074s
"""


Assigning to dsansome, who has both the most CLs in that directory, and the two most recent CLs in that directory. Feel free to reassign if someone else is more appropriate.
 

Comment 1 by stip@chromium.org, Oct 11 2016

Components: -Infra Infra>Flakiness
Cc: d...@chromium.org sergeybe...@chromium.org vadimsh@chromium.org
I can't seem to reproduce this failure locally (running all the tests continuously for an hour).

This code seems to be a cloud logging client inside the monitoring proxy.  I wonder if we should just remove it and use cloudtail instead.

Comment 3 by d...@chromium.org, Oct 12 2016

There is also github.com/luci/luci-go/common/logging/cloudlog

This is the same thing as the cloud logging client in monitoring/proxy. It was, in fact, written specifically to be used in monitoring/proxy. sergeyberezin@ asked me to refrain from making the transition because he wanted to use it as a learning point, but I don't think he has had time to do this yet. I recommend just swapping them out. It should be drop-in.
Ah thanks, yeah I'll use that instead.

luci-go/common/cloudlogging looks very similar to infra/tools/cloudtail - are there plans to merge those as well?

Comment 5 by d...@chromium.org, Oct 12 2016

vadimsh@ and I talked a lot about that. cloudtail was developed after monitoring/proxy but before cloudlog. We talked about migrating cloudtail over to the cloudlog packages but stopped short b/c cloudtail was working just fine.

I think the forced switch away from v1beta1 APIs will give us a good reason to switch cloudtail over to the LUCI cloud logging packages ( https://bugs.chromium.org/p/chromium/issues/detail?id=654505 ). AFAIK it has all of the functionality cloudtail needs (at least, as of the time it was written).
LUCI package doesn't do at least the following things:
1. Dropping logs if there's a congestion and all buffers are full (cloudtail learned to do it in the aftermath of one nasty outage).
2. Timeouts when terminating and flushing. All cloudtail guts are context-aware and support cancelation.

In the aftermath of the mentioned outage I've also spent some significant time making sure cloudtail behaves adequately under various bad conditions (by writing high level tests that emulate these conditions, among other things).

So I at least believe that it's better to base LUCI cloudlog on cloudtail, than the other way around :P

Comment 7 by d...@chromium.org, Oct 12 2016

Yeah I suppose I wasn't suggesting that the code as-is be used, but rather the package layout. If that means porting cloudtail's logic into the packages, so be it.
Status: Fixed (was: Assigned)
This code was deleted by https://chromium-review.googlesource.com/462895

Sign in to add a comment