"infra go steps" has flakily closed the infra tree 27 times in the last 40 days |
|||
Issue descriptionhttp://infra-status.appspot.com/?limit=80 "infra go steps" has been the cause of tree closure 27 times between September 1st and today (October 11th). The failure is almost always a single test in infra/monitoring/proxy. """ -------------------------------------------------------------------------------- .................................................F.......... -------------------------------------------------------------------------------- PACKAGE: infra/monitoring/proxy -------------------------------------------------------------------------------- ..... 5 total assertions ......x Failures: * /mnt/data/b/rr/tmpFMxIEP/w/infra/go/src/infra/monitoring/proxy/logging_test.go Line 212: Expected: '50' Actual: '8' (Should be equal) 12 total assertions --- FAIL: TestCloudLogging (0.00s) S 12 total assertions (one or more sections skipped) ...................... 34 total assertions (one or more sections skipped) ............. 47 total assertions (one or more sections skipped) FAIL coverage: 42.3% of statements FAIL infra/monitoring/proxy 0.074s """ Assigning to dsansome, who has both the most CLs in that directory, and the two most recent CLs in that directory. Feel free to reassign if someone else is more appropriate.
,
Oct 12 2016
I can't seem to reproduce this failure locally (running all the tests continuously for an hour). This code seems to be a cloud logging client inside the monitoring proxy. I wonder if we should just remove it and use cloudtail instead.
,
Oct 12 2016
There is also github.com/luci/luci-go/common/logging/cloudlog This is the same thing as the cloud logging client in monitoring/proxy. It was, in fact, written specifically to be used in monitoring/proxy. sergeyberezin@ asked me to refrain from making the transition because he wanted to use it as a learning point, but I don't think he has had time to do this yet. I recommend just swapping them out. It should be drop-in.
,
Oct 12 2016
Ah thanks, yeah I'll use that instead. luci-go/common/cloudlogging looks very similar to infra/tools/cloudtail - are there plans to merge those as well?
,
Oct 12 2016
vadimsh@ and I talked a lot about that. cloudtail was developed after monitoring/proxy but before cloudlog. We talked about migrating cloudtail over to the cloudlog packages but stopped short b/c cloudtail was working just fine. I think the forced switch away from v1beta1 APIs will give us a good reason to switch cloudtail over to the LUCI cloud logging packages ( https://bugs.chromium.org/p/chromium/issues/detail?id=654505 ). AFAIK it has all of the functionality cloudtail needs (at least, as of the time it was written).
,
Oct 12 2016
LUCI package doesn't do at least the following things: 1. Dropping logs if there's a congestion and all buffers are full (cloudtail learned to do it in the aftermath of one nasty outage). 2. Timeouts when terminating and flushing. All cloudtail guts are context-aware and support cancelation. In the aftermath of the mentioned outage I've also spent some significant time making sure cloudtail behaves adequately under various bad conditions (by writing high level tests that emulate these conditions, among other things). So I at least believe that it's better to base LUCI cloudlog on cloudtail, than the other way around :P
,
Oct 12 2016
Yeah I suppose I wasn't suggesting that the code as-is be used, but rather the package layout. If that means porting cloudtail's logic into the packages, so be it.
,
Nov 14 2016
The following revision refers to this bug: https://chromium.googlesource.com/infra/infra.git/+/860950029c190b56ce2204dc2dfed9549a40249b commit 860950029c190b56ce2204dc2dfed9549a40249b Author: David Sansome <dsansome@chromium.org> Date: Mon Nov 14 06:52:23 2016 Use LUCI's cloudlog module, delete our own BUG= 654885 Change-Id: Id5427ee2c35ca061d1a76e6bddee75e5fe92bbc1 Reviewed-on: https://chromium-review.googlesource.com/396511 Commit-Queue: Dave Sansome <dsansome@chromium.org> Reviewed-by: Sergey Berezin <sergeyberezin@chromium.org> Reviewed-by: Daniel Jacques <dnj@chromium.org> [delete] https://crrev.com/6c97e51c1e21e7ca795e1ea24de8153216980555/go/src/infra/monitoring/proxy/logging.go [delete] https://crrev.com/6c97e51c1e21e7ca795e1ea24de8153216980555/go/src/infra/monitoring/proxy/logging_test.go [modify] https://crrev.com/860950029c190b56ce2204dc2dfed9549a40249b/go/src/infra/monitoring/proxy/main.go [modify] https://crrev.com/860950029c190b56ce2204dc2dfed9549a40249b/go/src/infra/monitoring/proxy/proxy.infra_testing
,
Apr 6 2017
This code was deleted by https://chromium-review.googlesource.com/462895 |
|||
►
Sign in to add a comment |
|||
Comment 1 by stip@chromium.org
, Oct 11 2016