New issue
Advanced search Search tips

Issue 863144 link

Starred by 1 user

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Feature



Sign in to add a comment

bot name missing in logdog url

Project Member Reported by thakis@chromium.org, Jul 12

Issue description

What steps will reproduce the problem?
1. Open a compile stdout link, e.g. https://logs.chromium.org/v/?s=chromium%2Fbuildbucket%2Fcr-buildbucket.appspot.com%2F8941183393593542432%2F%2B%2Fsteps%2Fcompile__with_patch_%2F0%2Fstdout


What is the expected result?


Name of bot doing compile is present in URL.


What happens instead of that?

There's just some opaque ID.

This used to work. I often have the compile output from many bots open in many tabs, and now I can't ctrl-tab through my tabs to check them but have to scroll up and look at the ninja invocation to guess which bot this is running on.


Please provide any additional information below. Attach a screenshot if
possible.

UserAgentString: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36



 
Did you mean bot name (name of machine, which is swarm1168-c4 here) or builder name (win10_chromium_x64_rel_ng)?

I don't think we've ever shown the bot name in the URL before.  Can you link an example of one that did?
"win10_chromium_x64_rel_ng" (that used to be called "bot name" in buildbot parlance -- it's a bit unfortunate that the new system uses the same word for something different :-P)

Bug 795156 makes linking to old logs a bit difficult. But comment 23 on that bug mentions "https://logs.chromium.org/v/?s=chromium%2Fbb%2Ftryserver.chromium.win%2Fwin10_chromium_x64_rel_ng%2F111186%2F%2B%2Frecipes%2Fsteps%2Fanalyze%2F0%2Fstdout" which does show the bot name in the url, even if the data at that URL doesn't load due to that bug.
Cc: no...@chromium.org
It actually used to be called "slave" in buildbot.  I don't think we as infrastructure have ever called it "bot", but I'm happy to be wrong.

We stopped using "slave" in favor of "bot" due to the historical connotation of the word with slavery.

I see, yeah the luci URLs use buildbucket ID instead of buildbucket address.  Considering URLs are part of the user experience, it might make sense to use the buildbucket address instead.  (So it'd be cr-buildbucket.appspot.com/luci.chromium.try/win10_chromium_x64_rel_ng/48605, which technically isn't a legal logdog prefix since it has 4 components, but that's a problem we can probably solve)
+nodir: wdyt?
Sorry let me correct myself.

(Buildbot) Builder name - Name of the builder
(Buildbot) Bot name - Has no meaning.  Some people outside of infra have used this as Builder name, but we have never done so as infra
(Buildbot) Slave name - Name of the machine
(LUCI) Builder name - Name of the builder
(LUCI) Bot name - Name of the machine
(LUCI) Slave name - Not used.
Moving off "slave" sounds great, but "machine name" or similar wouldn't have collided with the old naming. Ah well, way too late to do anything about it anyways :-)
Just to make things more confusing, in this new world, a "Machine" refers to a physical computer or VM, while a bot refers to an execution instance.  This means that a "Machine" has the ability to run multiple "Bots"

My hope is that through the inevitability of time, this will all somehow become normalized.
>  which technically isn't a legal logdog prefix since it has 4 components

logdog prefix has a limit on a number of components? first time I hear about this.

---

we could add replace build id with build address in the logdog prefix, but that sounds like fixing a presentation problem at the much lower level. Could we print stream tags/metadata in the top of the page? Then we can include more interesting information, not only builder name, e.g. swarming bot id, its dimensions, etc.
thakis, the issue description says it is not in the logdog URL. Does it have to be specifically in the URL, or its present on the page would be sufficient?
both would be ideal. i often am scrolled to the bottom of the build output on a bunch of bots and ctrl-tab through my tabs to compare.
"having both would be ideal", meaning having it in the url would be super useful for me.
Cc: p...@chromium.org
Having it in the URL would also make it much easier to script interactions with logdog that involve fetching logs over a sequence of builds.
re 11: This isn't the behavior we'd want to encourage since build addresses aren't necessarily (or won't necessarily be) the canonical "sequence" of a builder.  The canonical sequences are stored in buildbucket, and should be queried.  Scripts may break if they make this assumption. 
hinoka, could you elaborate? there are different ways builds of a builder can be ordered (by creation time, but commit position). Build numbers are and will continue to be reflect build creation order.

pcc, when enumerating builds between build numbers A and B, do you assume that builds are ordered by commit position? Such assumption would be false if someone schedules builds for an older revision.

In general, the format of a logdog URL in a build is not subject to backward compatibility and we'd like to be able to change it without announcements and migrations. This is not a public API. There is public API to retrieve builds with their logdog URLs though. e.g. http://shortn/_fyZsKEyLtg

here is a commandline equivalent:

prpc call cr-buildbucket.appspot.com buildbucket.v2.Builds.SearchBuilds <<END
{
    "predicate": {
        "builder": {
            "project": "chromium",
            "bucket": "try",
            "builder": "win10_chromium_x64_rel_ng"
        },
        "status": "ENDED_MASK"
    },
    "fields": "builds.*.id,builds.*.number,builds.*.steps.*.name,builds.*.steps.*.logs"
}
END

this retrieve most recent 10 builds from win10_chromium_x64_rel_ng builder and includes step logs, build id and build number in the response (by default steps are omitted).

We can add logdog stream name to the step log and in the future we will add ordering by commit position.
Yes that's what I mean.  Build number sequences generally map to commit order, but that's not likely to be a guarantee. 
The scripting that I've done has all been one-offs against continuous fyi or perf builders (i.e. not trybots) where I know that the builds are sequential in commit position, or where I'm trying to compute some statistic using recent build logs where it doesn't really matter what order they're in. Your being able to change the format of the URL is fine for my purposes as long as there's a reasonably straightforward (if potentially unstable) way to fetch the log for step X for the last N builds. That almost seems possible with the RPC that you mention, except that:

1) it's not really obvious how I get from an URL like https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win to an RPC so if I'm scripting something I'm probably going to go with the path of least resistance and crudely screen scrape the LUCI pages with something like:

$ curl `curl -s https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win/70637 | grep -o https://.*steps/compile/0/stdout | sed -e 's/&#43;/+/'`

(obviously I wouldn't ever deploy or use that anywhere outside my one-offs)

2) the URLs in the response don't appear to be for the plain text logs, but that's probably easily fixable

3) N seems to be limited to 100, I sometimes need more than that and I'd prefer not to have to script maintaining the page token between requests.
I see, one-offs aren't too bad.  We're mostly sensitive about systems on top of this with differing assumptions.
Project Member

Comment 17 by bugdroid1@chromium.org, Jul 18

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/0ae7a3c2b522db3d196ccd119e760e972b0c4854

commit 0ae7a3c2b522db3d196ccd119e760e972b0c4854
Author: Nodir Turakulov <nodir@google.com>
Date: Wed Jul 18 22:18:06 2018

[prpc] Fix JSONPB prefix

https://chromium-review.googlesource.com/c/infra/luci/luci-go/+/1142513
updated the prpc protocol to inject `)]}'\n` prefix, not `)]}'`.
Update pRPC for Python.

Bug: 863144
Change-Id: I666110c9c7ea99f29bb4673b814616704f31b437
Reviewed-on: https://chromium-review.googlesource.com/1142521
Reviewed-by: Vadim Shtayura <vadimsh@chromium.org>
Commit-Queue: Nodir Turakulov <nodir@chromium.org>

[modify] https://crrev.com/0ae7a3c2b522db3d196ccd119e760e972b0c4854/appengine/components/components/prpc/encoding.py

Project Member

Comment 18 by bugdroid1@chromium.org, Jul 18

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-go.git/+/3536cde6803a041daca7dfccb97ec15b6fc32cc6

commit 3536cde6803a041daca7dfccb97ec15b6fc32cc6
Author: Nodir Turakulov <nodir@google.com>
Date: Wed Jul 18 22:30:35 2018

[buidbucket] Update SearchBuildsRequest.page_size doc

We no longer limit builds in the response to 100.
Also change default to 100.

Bug: 863144, 849358
Change-Id: I9d130f9679545cc9067f8c27c20983e41a158f4e
Reviewed-on: https://chromium-review.googlesource.com/1142530
Reviewed-by: Vadim Shtayura <vadimsh@chromium.org>
Commit-Queue: Nodir Turakulov <nodir@chromium.org>

[modify] https://crrev.com/3536cde6803a041daca7dfccb97ec15b6fc32cc6/buildbucket/proto/pb.discovery.go
[modify] https://crrev.com/3536cde6803a041daca7dfccb97ec15b6fc32cc6/buildbucket/proto/rpc.pb.go
[modify] https://crrev.com/3536cde6803a041daca7dfccb97ec15b6fc32cc6/buildbucket/proto/rpc.proto

Project Member

Comment 19 by bugdroid1@chromium.org, Jul 18

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/80919189fe478c9cabbf09ce7eacb455b74c4c68

commit 80919189fe478c9cabbf09ce7eacb455b74c4c68
Author: Nodir Turakulov <nodir@google.com>
Date: Wed Jul 18 22:34:00 2018

[buildbucket] Remove max_builds limit and update default

Clients do need to load > 100 builds in a search query. Do not make them
do paging. In the worst case, they will hit timeout and will have to do paging.

Also change the default to 100.

R=vadimsh@chromium.org

Bug: 863144, 849358
Change-Id: Ia339fd6d16374b2c94580777f16a8bc165cb0fc3
Reviewed-on: https://chromium-review.googlesource.com/1142516
Commit-Queue: Nodir Turakulov <nodir@chromium.org>
Reviewed-by: Vadim Shtayura <vadimsh@chromium.org>

[modify] https://crrev.com/80919189fe478c9cabbf09ce7eacb455b74c4c68/appengine/cr-buildbucket/test/service_test.py
[modify] https://crrev.com/80919189fe478c9cabbf09ce7eacb455b74c4c68/appengine/cr-buildbucket/search.py

in https://chromium-review.googlesource.com/c/infra/infra/+/1142568#message-8d22c14f4d431bd7f60fd5de699acb06c0e750fc
i tried to embed builder name in logdog prefix, but it turned out to be problematic. See why the CL was abandoned.

order of builds in *CI* builders is not guaranteed to match commit position order. Users already ask us to provide a button to schedule a build for arbitrary revision (bug 861913) and we'd like to provide that functionality. That would break the assumption.

1) i think this is fixable and in general should not be a reason to expose more APIs where existing public API cover the need. The RPC Explorer (link above) prints documentation if you press Ctrl+Space and you can IM me for questions about that API. Also there are docs https://chromium.googlesource.com/infra/luci/luci-go/+/master/buildbucket/proto/rpc.proto
In the URL, the most confusing part is "luci.chromium.ci" which will be changed to just "ci". That's the bucket. Overall the URL path format is /p/{project}/builders/{bucket}/{builder} and you can use those values in the request. So for completed Win builds, the predicate is
    {
        "builder": {
            "project": "chromium",
            "bucket": "ci",
            "builder": "Win"
        },
        "status": "ENDED_MASK"
    }

2) true, adding that is a part of the plan. In the meantime, you can use returned build ids to construct the logdog prefix.

3) removed the limit and changed the default to 100

--

The following will print build ids of 200 most recent completed builds of Win builder to build_ids file:

prpc call cr-buildbucket.appspot.com buildbucket.v2.Builds.SearchBuilds <<END |
{
    "predicate": {
        "builder": {
            "project": "chromium",
            "bucket": "ci",
            "builder": "Win"
        },
        "status": "ENDED_MASK"
    },
    "page_size": 200
}
END
jq -r '.builds[].id' > build_ids

Until (2) work is done, you can use those ids in the 
  https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket.appspot.com/{build_id}/+/steps/bot_update/0/stdout
URL template (note build_id there), e.g.

while read id
do
  curl -s https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket.appspot.com/$id/+/steps/bot_update/0/stdout
done < build_ids

the most recent deployment of buildbucket had to be reverted, so the prpc command above wouldn't work, but will work once the changes are redeployed
Project Member

Comment 22 by bugdroid1@chromium.org, Jul 19

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/0be3cf078974f078ef7eccc372dde4b1e735e4ec

commit 0be3cf078974f078ef7eccc372dde4b1e735e4ec
Author: Nodir Turakulov <nodir@chromium.org>
Date: Thu Jul 19 00:13:12 2018

Revert "[buildbucket] Remove max_builds limit and update default"

This reverts commit 80919189fe478c9cabbf09ce7eacb455b74c4c68.

Reason for revert: now we hit datastore constraints on "limit" parameter

Original change's description:
> [buildbucket] Remove max_builds limit and update default
> 
> Clients do need to load > 100 builds in a search query. Do not make them
> do paging. In the worst case, they will hit timeout and will have to do paging.
> 
> Also change the default to 100.
> 
> R=​vadimsh@chromium.org
> 
> Bug: 863144, 849358
> Change-Id: Ia339fd6d16374b2c94580777f16a8bc165cb0fc3
> Reviewed-on: https://chromium-review.googlesource.com/1142516
> Commit-Queue: Nodir Turakulov <nodir@chromium.org>
> Reviewed-by: Vadim Shtayura <vadimsh@chromium.org>

TBR=vadimsh@chromium.org,nodir@chromium.org

Change-Id: Iae54726209e074dbc3460710f70b8ab3b48e59f6
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: 863144, 849358
Reviewed-on: https://chromium-review.googlesource.com/1142021
Reviewed-by: Nodir Turakulov <nodir@chromium.org>
Commit-Queue: Nodir Turakulov <nodir@chromium.org>

[modify] https://crrev.com/0be3cf078974f078ef7eccc372dde4b1e735e4ec/appengine/cr-buildbucket/test/service_test.py
[modify] https://crrev.com/0be3cf078974f078ef7eccc372dde4b1e735e4ec/appengine/cr-buildbucket/search.py

Project Member

Comment 23 by bugdroid1@chromium.org, Jul 19

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-go.git/+/9f465569ac375f16b609177b1b7cd4888de32187

commit 9f465569ac375f16b609177b1b7cd4888de32187
Author: Nodir Turakulov <nodir@google.com>
Date: Thu Jul 19 00:50:55 2018

[buildbucket] Update SearchBuildsRequest.page_size doc

page_size field is limited to 1000.

R=vadimsh@chromium.org

Bug: 863144, 849358
Change-Id: I4179cb8efce2cd41cc3d1ef70f6ddec70423ae5c
Reviewed-on: https://chromium-review.googlesource.com/1142731
Reviewed-by: Vadim Shtayura <vadimsh@chromium.org>
Commit-Queue: Nodir Turakulov <nodir@chromium.org>

[modify] https://crrev.com/9f465569ac375f16b609177b1b7cd4888de32187/buildbucket/proto/pb.discovery.go
[modify] https://crrev.com/9f465569ac375f16b609177b1b7cd4888de32187/buildbucket/proto/rpc.pb.go
[modify] https://crrev.com/9f465569ac375f16b609177b1b7cd4888de32187/buildbucket/proto/rpc.proto

Project Member

Comment 24 by bugdroid1@chromium.org, Jul 19

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/57fb6c62855293be9757a1c7b8bbacbd5eaa53d2

commit 57fb6c62855293be9757a1c7b8bbacbd5eaa53d2
Author: Nodir Turakulov <nodir@google.com>
Date: Thu Jul 19 01:10:02 2018

[buildbucket] Change max_builds max to 1000

Clients do need to load > 100 builds in a search query. Do not make them
do paging. In the worst case, they will hit timeout and will have to do paging.
Change max value of max_builds to 1000.

Also change the default to 100.

Bug: 863144, 849358
Change-Id: I56b4a9520e970be6db706233d112bce5d3e3127b
Reviewed-on: https://chromium-review.googlesource.com/1142727
Commit-Queue: Nodir Turakulov <nodir@chromium.org>
Reviewed-by: Vadim Shtayura <vadimsh@chromium.org>

[modify] https://crrev.com/57fb6c62855293be9757a1c7b8bbacbd5eaa53d2/appengine/cr-buildbucket/proto/build_pb2.py
[modify] https://crrev.com/57fb6c62855293be9757a1c7b8bbacbd5eaa53d2/appengine/cr-buildbucket/test/service_test.py
[modify] https://crrev.com/57fb6c62855293be9757a1c7b8bbacbd5eaa53d2/appengine/cr-buildbucket/proto/rpc_pb2.py
[modify] https://crrev.com/57fb6c62855293be9757a1c7b8bbacbd5eaa53d2/appengine/cr-buildbucket/search.py
[modify] https://crrev.com/57fb6c62855293be9757a1c7b8bbacbd5eaa53d2/appengine/cr-buildbucket/proto/rpc_prpc_pb2.py

the prpc commands above work again
Cc: -no...@chromium.org
Labels: -OS-Linux
Owner: no...@chromium.org
Status: Assigned (was: Untriaged)
Assigning to Nodir since he worked on this.
Re nodir comment 3 "We stopped using "slave" in favor of "bot" due to the historical connotation of the word with slavery": goto.google.com/heytrooper still has "slave restart" and that template stills says "slave name". That should probably be updated too?
Yes, those should be updated too. Thanks for noticing.
Labels: -Type-Bug Type-Feature
Owner: ----
Status: Available (was: Assigned)

Sign in to add a comment