
Issue 758999 link

Starred by 1 user

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Feature

Blocking:
issue 758632




Redesign Swarming recipe API to chromium_test module?

Project Member Reported by eyaich@chromium.org, Aug 25 2017

Issue description

There are two overlapping efforts going on in the Chrome Speed Operations team (histogram set and one buildbot step) that are making changes to what will come back to steps.py in the chromium recipe from swarming tasks.

Right now .../swarming/api.py writes out two files that the chromium recipe can expect: 
https://cs.chromium.org/chromium/build/scripts/slave/recipe_modules/swarming/api.py?q=swarming/api&sq=package:chromium&l=521

I added chartjson-output.json when I started the perf swarming migration.  We questioned it then, and I think we should question it more now that we are changing it again. 

We are moving to a model where chartjson-output.json (which I think Ethan is renaming?) is not just the results of one benchmark, but instead the results of n benchmarks (needed for one buildbot step). 

Therefore, I think it is the right time to start rethinking this API: what files should be there, what their names are, and how we get them. 

There are at least a few seemingly straightforward solutions; please chime in with other ideas or with thoughts on the complexity/feasibility of each: 

1) Make the swarming api output just a directory instead of specific files, into which we can dump as much as we want and then just scrape it on the other side

2) Fake out swarming by making chartjson-output.json actually a zip file that contains n files, one for each benchmark

3) Write n files to this output directory in swarming and add n flags, one per benchmark, just like isolated-script-test-output :P

OK, I think #1 is the right way to go, but I'm not sure if it's feasible. #2 could be a stopgap if it is even feasible. And #3 is the obviously bad CS design approach.

MA please advise and feel free to assign back to me.
 

Comment 1 by eyaich@chromium.org, Aug 25 2017

Blocking: 758632
As a spin on #2, couldn't you also just make an output.json that's basically a dict of {
  run: output
}
Cc: jbudorick@chromium.org tansell@chromium.org
There is overlap here with other efforts I'd like to be working on, namely what I call the "One True Step API" project.

Specifically, it'd be good if we had a fixed small number (ideally one) of generic APIs that did what we needed, rather than trying to customize things for each test type. 

There is an obvious tradeoff between the value of a single API and the fact that different test types need different things. This is particularly true for perf vs. functional tests, so the answer might not be one API; but at the very least, it'd be good to better document whatever we *do* do, rather than have it be buried in the code.

On a related note, the layout tests have a similar problem where we generate a bunch of different files as output, not just one. What we ended up doing is generating N sets of files (one per shard) and then merging them into a single final result; those final results then get uploaded to various services like cloud storage buckets and the test-results dashboards.
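The shard-merging step described above can be sketched as follows; the collision handling is my assumption, and the real layout-test merge script does considerably more (uploading to GCS, etc.):

```python
def merge_shard_results(shard_results):
    """Merge per-shard result dicts into one final result set.

    shard_results: list of dicts, one per swarming shard, each mapping
    a test name to its result. Assumes shards partition the tests, so
    a name appearing in two shards indicates a trigger/sharding bug.
    """
    merged = {}
    for shard in shard_results:
        for test_name, result in shard.items():
            if test_name in merged:
                raise ValueError('duplicate result for %s' % test_name)
            merged[test_name] = result
    return merged
```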

tansell@ has also been prototyping some work where you could fetch things from the isolateserver directly, and that might be an interesting direction to consider.
#3: I would love to see generic APIs in core code as well. Even if the test types need different things, it would be great if we could resolve the differences through polymorphism or by setting a more rigid standard.

Seems like the situation here is that we have K swarming shards, and each shard can produce a number of files. What those files are, where they are stored, and what to do with them are highly specific to each test suite. 

So the problem here is about designing the API contract between the recipe & src side to handle this operation in such a way that we can avoid things like "if this test suite is ABC, do this" in recipe code?
> So the problem here is about designing the API contract between the recipe 
> & src side to handle this operation in such a way that we can avoid things 
> like "if this test suite is ABC, do this" in recipe code?

Yes, exactly.
(Sorry for the delay)

What you want is task triggering of related tasks (shards) and custom merging. kbr@ already implemented custom merge script support and martiniss@ implemented custom trigger script support.

Using both together lets you do fairly "creative" handling: custom triggering gives you shards that are effectively a set of closely related but separate tasks, and custom merging then takes the outputs and generates one coherent result set.

I feel what we're missing here is good documentation explaining how to do this, especially with src/testing/buildbot/*.json.
> I feel what we're missing here is good documentation explaining how to do 
> this, especially with src/testing/buildbot/*.json.

Yup.
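For concreteness, a hedged sketch of what such a test entry in src/testing/buildbot/*.json might look like; the key names ("swarming", "trigger_script", "merge") follow the spec format as I understand it, and the suite name and script paths are purely illustrative:

```
"my_perf_suite": {
  "swarming": {
    "can_use_on_swarming_builders": true,
    "shards": 4
  },
  "trigger_script": {
    "script": "//testing/trigger_scripts/my_trigger.py"
  },
  "merge": {
    "script": "//tools/perf/my_merge.py"
  }
}
```

This is exactly the kind of thing that documentation should pin down: which keys exist, what the trigger and merge scripts are invoked with, and what they must produce.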
As far as I understand things, given the custom merge (which is effectively a post process step which can make RPCs) and the custom trigger script, you can do some pretty creative things with what we want to do. I do agree that there needs to be more documentation about what the json files do. I could maybe write some of that.

I think the only thing stopping me from doing what I want specifically (perf tests) is that only output.json files are passed to merge scripts. This is specifically because of this line (https://cs.chromium.org/chromium/build/scripts/slave/recipe_modules/swarming/resources/collect_task.py?sq=package:chromium&dr&l=90) in the collect_task.py script. Is this done for a reason? I've heard that swarming has a whitelist of files that it'll expose in swarming output directories; maruel@ can you comment on that?
I think you're misinterpreting the code (if I'm understanding you correctly).

The merge script has access to all of the shards' output directories; you can pass back anything you like. The layout tests' merge script merges stuff in said directories to upload them to GCS as a single zip file, for example.

I think that code is just making sure that the files swarming itself needs (the output.json files) are there as expected.


How is the merge script supposed to access the shard output directories? The merge_cmd there only gets passed the info about the output.json files, and it seems "wrong" to guess the output directories for each shard based on the output.json paths. Is that info in the summary json file or something?
Currently the code infers the directories from the location of the output.json files:

https://cs.chromium.org/chromium/src/third_party/WebKit/Tools/Scripts/merge-layout-test-results?l=171

While I think this is safe, it's not well documented, which is an example of why we actually need the go/chromium-test-step-api work :).
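The inference is straightforward, since each shard's output.json sits at the root of that shard's output directory; a sketch (the function name is mine, not from the linked script):

```python
import os


def shard_output_dirs(shard_output_json_paths):
    """Infer each shard's output directory from its output.json path.

    Each swarming shard writes output.json at the root of its own
    output directory, so the directory is just the dirname. This is
    the (currently undocumented) assumption the merge scripts rely on.
    """
    return [os.path.dirname(p) for p in shard_output_json_paths]
```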
Ah ok. Well, if someone else does that already, then I guess we can copy that ;)

The other big decision we have to make is to move the code we currently have in tools/build for uploading data to the perf dashboard over to chromium/src. This code currently lives in https://cs.chromium.org/chromium/build/scripts/slave/upload_perf_dashboard_results.py?sq=package:chromium&dr (and a few other files) and is rather unorganized. I think it'd be good to move this to src, but it'd be a fairly complicated change.

This also is happening at the same time that eakuefener is transitioning everything over to histogram sets, which is somewhat confusing. I should be able to move the code he wrote recently to src, but I think we'd want to make sure we aren't both changing the same things at the same time.

This new histogram set code also requires OAuth2 authentication, although I don't exactly remember why; I'm not sure whether existing merge scripts have the ability to do that. 
Status: Assigned (was: Untriaged)
Cc: -eakuefner@chromium.org
