
Issue 759002

Starred by 1 user

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Feature

Blocking:
issue 758632




Redesign Perf dashboard upload specific parts of chromium_test module (?)

Project Member Reported by eyaich@chromium.org, Aug 25 2017

Issue description

After discussing two overlapping efforts in the Chrome Speed Operations team (histogram sets and one buildbot step), we concluded that we should at least pause and think about the right path forward for Chrome as a whole regarding the relationship between infra-side scripts (runtest.py, upload_perf_dashboard_results.py, etc.), recipe modules (chromium_tests steps.py specifically), and src-side scripts. (I can't remember whether we have a current one that is invoked from our chromium_tests module. Ethan, maybe your new catapult binary to generate histogram sets?)



For our current use case in Chrome Speed Operations (one buildbot and histogram sets), we need to refactor the perf dashboard upload slightly to support our next overlapping milestones. The current state of the world:

Potentially 1 to n result files, each containing 1 to n benchmarks. Each of these files could contain legacy JSON (currently output by the C++ perf tests), chart JSON, histogram set data, or a combination of the above.
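
To make the mixed-format problem concrete, here is a rough sketch of how an upload step might sort result files by format before deciding how to handle each one. The detection rules are assumptions for illustration only (a top-level JSON list is treated as a histogram set, a dict with a "charts" key as chart JSON, anything else as legacy JSON), the function names are hypothetical, and files that combine several formats are not handled.

```python
import json


def classify_result(data):
    """Guess the format of one parsed results file.

    Heuristics are assumptions for this sketch, not a spec:
      - top-level list           -> histogram set
      - dict with a "charts" key -> chart JSON
      - anything else            -> legacy JSON
    """
    if isinstance(data, list):
        return "histogram_set"
    if isinstance(data, dict) and "charts" in data:
        return "chart_json"
    return "legacy_json"


def classify_files(paths):
    """Map each results file path to its guessed format."""
    formats = {}
    for path in paths:
        with open(path) as f:
            formats[path] = classify_result(json.load(f))
    return formats
```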

We need to collect all this data for one run of one perf waterfall configuration (which could come from n machines in the swarming pool) and upload it safely (OAuth tokens could be necessary for histogram sets) to both the perf dashboard and the flakiness dashboard in 1 to n RPCs, where n is the number of benchmarks (we are still figuring out which APIs are currently available for each dashboard).
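
One way to picture the "1 to n RPCs" requirement is to group everything the shards produced by benchmark and then issue one upload per benchmark. This is only a sketch: `upload` is a hypothetical callable standing in for whatever dashboard API we end up with, and authentication (e.g. OAuth tokens) is assumed to be handled inside it.

```python
from collections import defaultdict


def group_by_benchmark(shard_results):
    """Group (benchmark_name, payload) pairs collected from all shards."""
    grouped = defaultdict(list)
    for benchmark, payload in shard_results:
        grouped[benchmark].append(payload)
    return dict(grouped)


def upload_all(shard_results, upload):
    """Call upload(benchmark, payloads) once per benchmark.

    `upload` is a hypothetical stand-in for the real dashboard RPC.
    Returns the names of benchmarks whose upload raised, so that one
    failing benchmark does not block the others.
    """
    failed = []
    grouped = group_by_benchmark(shard_results)
    for benchmark, payloads in sorted(grouped.items()):
        try:
            upload(benchmark, payloads)
        except Exception:  # broad on purpose in this sketch
            failed.append(benchmark)
    return failed
```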


OK, that is our use case; let's tie it back to how to design this right in recipe land. We currently outsource to two infra-side scripts, upload_perf_dashboard_results.py and results_dashboard.py (why are there two?), to upload data to the perf dashboard (I'm not sure where we actually upload to the flakiness dashboard).

We also need to be able to talk to catapult somewhere with specific builder information (where a builder here means information about the swarming bot the test ran on) to create the histogram sets. We may or may not have to do this on every run, but we are moving to a model (2018 goal?) where we only support histogram set results.

That being said, my first gut feeling is that we should stop using infra-side scripts (they can't be tested easily) and only use src-side scripts. But do we even need a src-side script at all? Can we do all this uploading in a perf-specific recipe module that the chromium recipe calls into? We stumbled upon the module linked below when migrating perf to swarming months ago, but I don't think we ever really figured out who used it, exactly what it did, or whether we could repurpose it.

https://cs.chromium.org/chromium/build/scripts/slave/recipe_modules/perf_dashboard/



OK, this is a lot of info for a bug, but I wanted to get the ball rolling. Dirk, I am assigning this to you for initial thoughts; feel free to ignore it, and I can bring it up in our weekly sync to make sure we are on track for one buildbot starting August 30th.

Anyone else, please chime in with additional thoughts or things I missed.

Ashley, I included you in case you have any thoughts around the flakiness dashboard since I have little context there.  Please feel free to remove and add the right person for flakiness dashboard insights.
 

Comment 1 by eyaich@chromium.org, Aug 25 2017

Cc: jbudorick@chromium.org
+ jbudorick

I vaguely remember martiniss@ saying that John implemented a similar src side (?) custom merge script that is called from swarming, so he might have thoughts on scripts and the right higher level design in recipe land.
For the flakiness dashboard, the upload to test-results.appspot.com happens here:
https://cs.chromium.org/chromium/build/scripts/slave/recipe_modules/chromium_tests/steps.py?type=cs&l=885
I think most of the flakiness efforts use the json-test-results format uploaded to test-results.appspot.com, and other efforts may use this data too, so I'd like to avoid moving to histogram sets only (although removing other formats, or reducing them as much as possible, would be wonderful).
The TBMv2 HistogramSet JSON format is documented in catapult: https://github.com/catapult-project/catapult/blob/master/docs/histogram-set-json-format.md
The format is stable as far as this bug is concerned afaik.

Owner: eyaich@chromium.org
Yes, we'll probably want to move to a src-side merge script that uploads the results, and we should hammer out the details of the APIs. I don't think we'll want a perf recipe module (hopefully).

Let me know how else I can help here ...
I was thinking we'd do the uploading on a src side merge script, I just wasn't sure if we should. This is mostly because of the name; a "merge" script doesn't sound to me like it should be doing RPCs to other services, but just processing local files. Maybe it's better to think of it as a post processing script?
Correct, it is better to think of it as a post-processing script.
It is fundamentally still responsible for merging the output shard JSONs together, hence the name. In cases where only a single shard is used, though, it is basically just postprocessing. 
(and yeah, a src-side merge script sgtm here.)
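
For reference, a minimal sketch of the "merge, then post-process" shape being discussed. It assumes each shard writes a JSON list and that merging is plain concatenation; the function names and the `postprocess` hook (where an upload would go) are hypothetical, not the existing merge script interface.

```python
import json


def merge_shard_outputs(shard_paths):
    """Concatenate the JSON lists written by each shard into one list."""
    merged = []
    for path in shard_paths:
        with open(path) as f:
            merged.extend(json.load(f))
    return merged


def run(shard_paths, output_path, postprocess=None):
    """Merge shard outputs, write the result, then optionally post-process.

    With a single shard the merge is a no-op copy, so the script is
    effectively just the post-processing step.
    """
    merged = merge_shard_outputs(shard_paths)
    with open(output_path, "w") as f:
        json.dump(merged, f)
    if postprocess is not None:
        postprocess(merged)  # e.g. a hypothetical dashboard upload
    return merged
```

Keeping the upload behind a hook means the merge logic can be unit tested on local files without any network access.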
Status: Assigned (was: Untriaged)
Cc: -eakuefner@chromium.org
