New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 808193 link

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Mar 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug



Sign in to add a comment

LUCI-{GateKeeper,FindIt,SOM} integration does not work for new LUCI builders

Project Member Reported by no...@chromium.org, Feb 1 2018

Issue description

LUCI-{Gatekeeper,FindIt,SOM} integration is currently entirely based on Buildbot API Emulation which currently uses migration app for {(master, builder):bucket} mapping. If a new LUCI builder is created, migration app doesn't see it (because there is nothing to migrate, it is not on Buildbot), so GK/FindIt/SOM do not see it either.

context: Dart cannot use GK to send blame emails (cannot find the bug)
 

Comment 1 by no...@chromium.org, Feb 1 2018

Owner: no...@chromium.org
Status: Assigned (was: Untriaged)
options

1) change migration app to expose new API get_luci_bucket_of(master) and change Milo to use that instead of the current get_builder_info(master, builder), where the response includes luci bucket. Then it does not matter if new builder exists on a master, or even if the master exists. The only thing that matters is that the master is defined in migration app's config. This solves Dart's problem.

2) change API emulation to add a second source of {(master, builder):bucket} mapping, e.g.

# luci-milo.cfg

... consoles, etc...
buildbot {
  master {
    name: "tryserver.chromium.linux"
    luci_bucket: "luci.chromium.try"
  }
}

both look OKish to me, with slight preference for 1) because it accumulates tech debt in a migration app which we'll delete at once anyway.

Comment 3 by no...@chromium.org, Feb 7 2018

Cc: dpranke@chromium.org
Status: Started (was: Assigned)
Dirk, I have two CLs that fix the problem for new *builders*, but not for new *masters*. How OK is that?
That sounds fine. I have no expectation that we'll want new masters in the near future.

Comment 5 by no...@chromium.org, Feb 7 2018

I meant that if we spin up a new LUCI bucket with no associated master, integrations won't work
Yes, that is what I understood you to mean.
Project Member

Comment 7 by bugdroid1@chromium.org, Feb 7 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/2eb4ff0688a7b63c49b6edaa71c66d8d323107f3

commit 2eb4ff0688a7b63c49b6edaa71c66d8d323107f3
Author: Nodir Turakulov <nodir@google.com>
Date: Wed Feb 07 23:36:09 2018

[luci-migration] add JSON response format in master handler

Add JSON response format to master handler to be able to read LUCI bucket
of a buildbot master.

Bug:  808193 
Change-Id: I737f2310381ad9b55c3b30d2f9de36a1926de6a6
Reviewed-on: https://chromium-review.googlesource.com/907375
Reviewed-by: Andrii Shyshkalov <tandrii@chromium.org>
Commit-Queue: Nodir Turakulov <nodir@chromium.org>

[modify] https://crrev.com/2eb4ff0688a7b63c49b6edaa71c66d8d323107f3/go/src/infra/appengine/luci-migration/app/master.go

Project Member

Comment 8 by bugdroid1@chromium.org, Feb 9 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-go.git/+/665a380f6bbd1cf51460a4b8f82708f1d8f9c6bd

commit 665a380f6bbd1cf51460a4b8f82708f1d8f9c6bd
Author: Nodir Turakulov <nodir@google.com>
Date: Fri Feb 09 02:44:54 2018

[milo] use new migration app endpoint to read bucket of a master

Use the new endpoint added in https://chromium-review.googlesource.com/#/c/infra/infra/+/907375

This solves  crbug.com/808193  for new builders, but not new masters.

Bug:  808193 
Change-Id: I5627195468d2077a78d5a95d13a5a402b4c79c30
Reviewed-on: https://chromium-review.googlesource.com/907376
Reviewed-by: Andrii Shyshkalov <tandrii@chromium.org>
Commit-Queue: Nodir Turakulov <nodir@chromium.org>

[modify] https://crrev.com/665a380f6bbd1cf51460a4b8f82708f1d8f9c6bd/milo/buildsource/buildbot/buildstore/build.go
[modify] https://crrev.com/665a380f6bbd1cf51460a4b8f82708f1d8f9c6bd/milo/buildsource/buildbot/buildstore/emulation.go
[modify] https://crrev.com/665a380f6bbd1cf51460a4b8f82708f1d8f9c6bd/milo/buildsource/buildbot/buildstore/query.go

Comment 9 by athom@google.com, Feb 15 2018

Are the changes above in production, yet? Or is there additional configuration required to enable this?

I would have expected to get a mail for this build but didn't: https://ci.chromium.org/p/dart/builds/b8954509610657029808

Comment 10 by no...@chromium.org, Feb 21 2018

looks like changes above were not sufficient. GK discovers builders by calling

https://ci.chromium.org/rpcexplorer/services/milo.Buildbot/GetCompressedMasterJSON?request={%20%20%20%20%22name%22:%20%22client.dart%22}

but it contains only builders present on Buildbot master. Milo also does not know about the new builder.

Comment 11 by athom@google.com, Feb 21 2018

Components: Infra>Client>Dart

Comment 12 by no...@chromium.org, Feb 21 2018

I've uploaded https://chromium-review.googlesource.com/c/infra/luci/luci-go/+/929663
but perhaps it wouldn't really work: we cannot inject all luci.foo.bar builders in a master that is being migrated to luci.foo.bar because that might be not the only master. Otherwise, for example, all luci.chromium.try builders will show up on tryserver.chromium.linux, including windows ones.

I am thinking we will have to resort to explicit whitelisting of builders in milo config. tandrii, hinoka, WDYT?

Comment 13 by no...@chromium.org, Feb 26 2018

talked to tandrii@ offline. We don't want another manually-maintained config, but use what I've started in comment 12, except will include properties in the swarmbucket response and will filter builders by mastername property when emulating a master.

Comment 14 by no...@chromium.org, Feb 27 2018

Owner: hinoka@chromium.org
Status: Assigned (was: Started)
hinoka agreed to take over 

Comment 15 by no...@chromium.org, Feb 27 2018

Cc: no...@chromium.org
Project Member

Comment 16 by bugdroid1@chromium.org, Feb 27 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-go.git/+/2e4df7a137bc82ed1b26d3db248b8908316c238b

commit 2e4df7a137bc82ed1b26d3db248b8908316c238b
Author: Nodir Turakulov <nodir@google.com>
Date: Tue Feb 27 20:19:55 2018

[milo] Export getting swarmbucket builders API.

Expose a buildbucket package function that returns swarmbucket builders
to be used in the next CL that will include swarmbucket-only builders in
emulated GetCompressedMasterJSON responses.

Rename GetAllBuilders functions to CIService because they return a
ui.CIService.

Add memcaching for loading swarmbucket builders. Cache key includes
current identity so that access control is preserved. The cache hit rate is
expected to be high for service accounts calling Buildbot API.

Bug:  808193 
Change-Id: I127070b120a172767a21f19b5fa527815716491e
Reviewed-on: https://chromium-review.googlesource.com/929663
Commit-Queue: Nodir Turakulov <nodir@chromium.org>
Reviewed-by: Ryan Tseng <hinoka@chromium.org>

[modify] https://crrev.com/2e4df7a137bc82ed1b26d3db248b8908316c238b/milo/buildsource/buildbot/builder.go
[modify] https://crrev.com/2e4df7a137bc82ed1b26d3db248b8908316c238b/milo/buildsource/buildbot/master_test.go
[modify] https://crrev.com/2e4df7a137bc82ed1b26d3db248b8908316c238b/milo/buildsource/buildbucket/buckets.go
[modify] https://crrev.com/2e4df7a137bc82ed1b26d3db248b8908316c238b/milo/frontend/view_search.go

Comment 17 by efoo@chromium.org, Feb 28 2018

Labels: LUCI-Chromium-CQSets LUCI-Blocker-Chromium-CQSets
Issue 817988 has been merged into this issue.
Status: Fixed (was: Assigned)
Should be fixed now
https://screenshot.googleplex.com/kOGa3bWsqW8

Comment 20 by athom@google.com, Mar 5 2018

The Gatekeeper step for dart is failing with a 500 Error now, is this connected?

https://logs.chromium.org/v/?s=chromium%2Fbb%2Fchromium.gatekeeper%2FChromium_Gatekeeper%2F1178881%2F%2B%2Frecipes%2Fsteps%2Fgatekeeper%3A_dart%2F0%2Fstdout

Also, when I attempt to go here: http://chrome-build-extract.appspot.com/get_master/client.dart I get a 500 error (not sure if that is supposed to work for the Dart masters).

Status: Assigned (was: Fixed)
Looking
Root cause is found, new version was rolled back in the meantime.  Fix incoming.
Status: Fixed (was: Assigned)
Now it should work.

Comment 25 by athom@google.com, Mar 6 2018

I still can't verify that this is fixed:

This build should have triggered a blamelist notification: https://ci.chromium.org/p/dart/builds/b8952821537349105888

The builder vm-kernel-linux-release-simdbc64 is in http://chrome-build-extract.appspot.com/get_master/client.dart

The user on the blamelist claims not to have received any mail (I did check his inbox myself to make sure he didn't just miss it) and I can't see any evidence of Gatekeeper reacting to the failure in the "dart" step in the Gatekeeper logs from that time (builds #1179174 - #1179176 on https://ci.chromium.org/buildbot/chromium.gatekeeper/Chromium%20Gatekeeper/).

GK requires build numbers to work. Buildbot API assumes build numbers, so API emulation requires build numbers too and ignores builds without numbers. You may have to enable them in your cr-buildbucket.cfg, at least until we have a better alternative. You may like them, though.


Comment 27 by athom@google.com, Mar 6 2018

After adding build numbers, GK became aware of the Luci builders, but still fails:
https://ci.chromium.org/buildbot/chromium.gatekeeper/Chromium%20Gatekeeper/1179561
Traceback (most recent call last):
  File "/mnt/data/b/build/scripts/slave/gatekeeper_ng.py", line 1008, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/mnt/data/b/build/scripts/slave/gatekeeper_ng.py", line 998, in main
    simulate)
  File "/mnt/data/b/build/scripts/slave/gatekeeper_ng.py", line 694, in notify_failures
    build_data['revisions'] = [x['revision'] for x in build_data['changes']]
TypeError: 'NoneType' object is not iterable

A build that triggered the failure in GK:
https://ci.chromium.org/p/dart/builders/luci.dart.ci/vm-kernel-precomp-win-release-x64/1

Comment 28 by athom@google.com, Mar 7 2018

GK seems to have recovered from the failure above, but now GK fails in a different error:

Traceback (most recent call last):
  File "/mnt/data/b/build/scripts/slave/gatekeeper_ng.py", line 1008, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/mnt/data/b/build/scripts/slave/gatekeeper_ng.py", line 998, in main
    simulate)
  File "/mnt/data/b/build/scripts/slave/gatekeeper_ng.py", line 738, in notify_failures
    submit_email(email_app_url, build_data, secret, simulate)
  File "/mnt/data/b/build/scripts/slave/gatekeeper_ng.py", line 491, in submit_email
    with closing(logging_urlopen(req)) as f:
  File "/mnt/data/b/build/scripts/slave/gatekeeper_ng.py", line 62, in logging_urlopen
    return urllib2.urlopen(url, *args, **kwargs)
  File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 410, in open
    response = meth(req, response)
  File "/usr/lib/python2.7/urllib2.py", line 523, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.7/urllib2.py", line 448, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 531, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 500: Internal Server Error

Failed build: https://ci.chromium.org/p/dart/builders/luci.dart.ci/vm-kernel-win-release-x64/12

GK: https://ci.chromium.org/buildbot/chromium.gatekeeper/Chromium%20Gatekeeper/1180111
#28 looks like an issue during sending submit_email, which is unrelated Luci/Milo.  I suggest filing a new GK bug.

Comment 30 by athom@google.com, Mar 7 2018

Ok, I'll file a separate bug. I reported it here because it worked fine before this fix for the Buildbot builders (and it still does). Only the Luci builders trigger the errors in GK when failing.

Comment 31 by athom@google.com, Mar 7 2018

I filed bug #819690.

Sign in to add a comment