
Issue 630006

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Feb 2017
Cc:
Components:
EstimatedDays: ----
NextAction: 2017-02-21
OS: ----
Pri: 1
Type: Feature

Blocked on:
issue 651497
issue 404338


Participants' hotlists:
speed-ops-backlog



Set up sheriff-o-matic for the Perf team

Project Member Reported by benhenry@chromium.org, Jul 20 2016

Issue description

The Perf team has a sheriff rotation but does a lot of things manually. It could benefit from both sheriff-o-matic and more monitoring/instrumentation help, so that sheriffs wouldn't have to file bugs for troopers when things break.

This bug is specifically about getting sheriff-o-matic up for the Perf team.
 
Cc: dtu@chromium.org nednguyen@chromium.org martiniss@chromium.org
+dtu, nednguyen

Note that it's already set up, but we have some problems specific to perf waterfalls that will likely need to be addressed:

* We are waiting for the new UI that we saw in staging because it does better grouping of related failures (multiple bots failing the same test starting at the same revision range, multiple tests with different names failing at the same revision range).
* We have some flaky tests that fail, say, every fifth run. Ideally there would be an easy way to see that the failure is an ongoing issue. (Does integrating with the flakiness dashboard help?)
* Currently when a device fails on our waterfall, we list all tests that should have run on that device as failed. We think that if we change our recipe to output red/green steps for devices separate from red/green steps for tests, sheriff-o-matic would be able to display device issues separately from test issues.
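
To illustrate the first and third points, here is a minimal Python sketch of grouping failures by revision range while keeping device steps apart from test steps. The `Failure` record, its field names, the `kind` labels, and the bot and step names are all hypothetical; this is not sheriff-o-matic's or the perf recipe's actual data model.

```python
from collections import defaultdict
from typing import NamedTuple


class Failure(NamedTuple):
    """Hypothetical record for one failing step on one bot."""
    bot: str
    step: str
    kind: str            # "device" or "test" (assumed labels)
    last_good_rev: int   # last revision where the step passed
    first_bad_rev: int   # first revision where the step failed


def group_by_revision_range(failures):
    """Group failures that share a kind and a (last good, first bad) range."""
    groups = defaultdict(list)
    for f in failures:
        groups[(f.kind, f.last_good_rev, f.first_bad_rev)].append(f)
    return dict(groups)


# Made-up sample data, for illustration only.
failures = [
    Failure("android-bot-1", "benchmark_a", "test", 100, 105),
    Failure("android-bot-2", "benchmark_a", "test", 100, 105),
    Failure("android-bot-2", "benchmark_b", "test", 100, 105),
    Failure("android-bot-3", "device_status", "device", 110, 111),
]

for (kind, good, bad), members in sorted(group_by_revision_range(failures).items()):
    bots = sorted({f.bot for f in members})
    steps = sorted({f.step for f in members})
    print(f"{kind} failures in ({good}, {bad}]: steps={steps} bots={bots}")
```

With a separate step kind for devices, a single dead device would show up as one device group instead of as a failure of every test scheduled on it.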
Blockedon: 404338
Owner: martiniss@chromium.org
Status: Assigned (was: Untriaged)
+rnephew

(commenting on each point in #1)

* To confirm, your team isn't using the current UI at https://sheriff-o-matic.appspot.com/chromium.perf right now, correct?

* Integrating with the flakiness dashboard won't do anything right now. It could though; see bug 404338, and feel free to comment on that bug if you have suggestions for what that would look like.

* I had taken a look at device failures, which seemed to be cluttering up the perf tree on SOM a lot. I sent out an email a week or so ago, and from what I understand rnephew is working on a refactor that should help with many of those device failure steps. Randy, is there a bug on file for that work? John Budorick said that you've turned this on for S5s today?


It's correct that we're not using the current UI.

Thanks for linking the flakiness dashboard bug, I updated it to explain our use case a bit more deeply.

Randy, I'd be interested to hear how your refactor will help (and to see an update in the doc: https://docs.google.com/document/d/1E8a2x1CWcxDOid-9_OVodHRmr0II-4v5MxEPKDdlV34/edit)
Labels: Milestone-PostSoMNG
(assigning milestones)
I've made a local prototype of what the perf tree would look like on SOM: https://screenshot.googleplex.com/OMB4Xb1Yq7j.png
Neat! Feel free to mail to perf-sheriffs@chromium.org for feedback!
Can we report failures on a per-benchmark and per-page basis? In Telemetry, a benchmark is a test suite that contains many test cases.
Just to clarify comment #8: Perf benchmarks are usually named like <metric>.<page_set>, where <metric> is the main thing being measured (smoothness, thread_times, system_health) and page_set is a group of web pages we are checking that metric on (top_10_mobile, tough_ad_cases). What we'd love to be able to do is quickly look at the waterfall and correctly classify failures like:

"All smoothness benchmarks are failing on android"
"All facebook.com test cases are failing"
"key_mobile_sites pageset is failing on Mac"

The middle one is tricky because facebook.com is a specific site that could be in multiple pagesets. I think we had briefly talked about whether we can send data about which pages are running to the flakiness dashboard, to give it and sheriff-o-matic a better idea of test structure.
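
As a rough sketch of that classification, assuming each failure arrives as a (platform, benchmark, page) tuple, the `<metric>.<page_set>` name can be split and failures indexed three ways. The tuple shape, the function names, and the sample benchmark/page combinations are assumptions for illustration, not what the waterfall or the flakiness dashboard actually emits.

```python
from collections import defaultdict


def parse_benchmark(name):
    """Split '<metric>.<page_set>' at the first dot."""
    metric, _, page_set = name.partition(".")
    return metric, page_set


def index_failures(failures):
    """Index failing benchmarks by (metric, platform), (page_set, platform), and page."""
    by_metric = defaultdict(set)
    by_page_set = defaultdict(set)
    by_page = defaultdict(set)
    for platform, benchmark, page in failures:
        metric, page_set = parse_benchmark(benchmark)
        by_metric[(metric, platform)].add(benchmark)
        by_page_set[(page_set, platform)].add(benchmark)
        # Pages are keyed on their own because one site can sit in several page sets.
        by_page[page].add(benchmark)
    return by_metric, by_page_set, by_page


# Made-up sample data, for illustration only.
sample = [
    ("android", "smoothness.top_10_mobile", "facebook.com"),
    ("android", "smoothness.tough_ad_cases", "example.com"),
    ("mac", "thread_times.key_mobile_sites", "facebook.com"),
]

by_metric, by_page_set, by_page = index_failures(sample)
print(sorted(by_metric[("smoothness", "android")]))
print(sorted(by_page["facebook.com"]))
```

The per-page index is the part that needs page-level data from the test runs; the other two can be derived from benchmark names alone.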
Blockedon: 651497
Labels: -Milestone-PostSoMNG perf
Labels: -perf Milestone-Perf
Ping - please provide an update to your high priority bug. This bug is stale. Is it really P-1?
NextAction: 2017-01-12
NextAction: 2017-02-21
Status: Fixed (was: Assigned)
This is pretty much done. I think we can close it?
Yep! We should be filing follow-up bugs for specific problems.
