
Issue 654693

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Oct 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 2
Type: Bug




Add more alerts to system_health.memory_* stories

Project Member Reported by hpayer@chromium.org, Oct 11 2016

Issue description

Right now, it seems that we only have alerts turned on for v8 metrics. Regressions in other subsystems currently go unnoticed, for example this recent memory regression in skia (there is also a small bump before it, which I have to triage manually). We should turn on alerts for all Chrome components. Annie, can you do that?
 

Comment 1 by hpayer@chromium.org, Oct 11 2016

Cc: hablich@chromium.org
Cc: machenb...@chromium.org
Components: Infra
Cc: perezju@chromium.org
Components: -Infra
I think the best place to follow up on this discussion is https://github.com/catapult-project/catapult/issues/2722

The fact that we don't monitor all the metrics is unfortunate but WAI. We did that until a while ago and that caused the sheriffs to riot against memory-infra :)
The problem is that an alert in a sub-metric (say webcache) also recursively fires alerts for (browser, renderer, all processes) x (webcache, private_dirty, partition_alloc, native_heap, pss),
so you easily get 12x the number of alerts you need.
There is a long-term plan to express an "influencing graph" at the metric level to group metrics, but that's not there yet.
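
To make the fan-out concrete, here is a minimal sketch that enumerates the cross product of the process and metric names listed above. The `process:metric` naming is hypothetical and not the dashboard's real series format, and the exact multiplier depends on which aggregates actually include the regressed sub-metric; the only point is that one root cause can show up under many names.

```python
from itertools import product

# Names copied from the comment above; the real metric hierarchy is larger.
PROCESSES = ["browser", "renderer", "all_processes"]
METRICS = ["webcache", "private_dirty", "partition_alloc", "native_heap", "pss"]


def candidate_alerts(processes, metrics):
    """Return every (process, metric) series that could alert on its own.

    The "process:metric" string format is an illustrative assumption.
    """
    return [f"{process}:{metric}" for process, metric in product(processes, metrics)]


series = candidate_alerts(PROCESSES, METRICS)
print(len(series), "series can alert for a single underlying regression")
for name in series:
    print(name)
```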

Going back to the question here, "why did [1] not trigger an alert": perezju explained to me that, according to the current alerts config (see catapult bug), we monitor only averages per story group and not per story (/* vs /*/* in the alert configs). So very likely the flipboard regression got diluted by the other stories not regressing.
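
The dilution effect can be illustrated with made-up numbers. In the sketch below, the story names other than flipboard, the memory values, and the 10% threshold are all hypothetical, and a plain relative-change threshold stands in for whatever anomaly detection the dashboard really uses.

```python
from statistics import mean

# Hypothetical memory usage in MB before and after a change.
# Only the "flipboard" story regresses (by 20%); the others are flat.
before = {
    "flipboard": 100.0,
    "story_b": 100.0,
    "story_c": 100.0,
    "story_d": 100.0,
    "story_e": 100.0,
}
after = dict(before, flipboard=120.0)

# Hypothetical alert threshold: fire on a relative increase above 10%.
THRESHOLD = 0.10


def relative_change(old, new):
    """Fractional change from old to new."""
    return (new - old) / old


# Per-story monitoring: each story is compared against its own baseline.
per_story = {name: relative_change(before[name], after[name]) for name in before}
story_alerts = [name for name, change in per_story.items() if change > THRESHOLD]

# Per-group monitoring: only the average across stories is compared.
group_change = relative_change(mean(before.values()), mean(after.values()))
group_alert = group_change > THRESHOLD

print("per-story alerts:", story_alerts)
print("group change: {:.1%}, alert: {}".format(group_change, group_alert))
```

With these numbers the single 20% regression moves the group average by only 4%, so a group-level check stays quiet while a per-story check would fire.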

The latter (story vs story group alerts) is probably a genuine mistake and not WAI, and we should put it back on individual stories.
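
As a sketch of the `/*` vs `/*/*` distinction, the matcher below assumes that each `*` in a pattern matches exactly one path segment. That is my reading of the remark above, not documented dashboard behaviour, and the paths and patterns are placeholders.

```python
from fnmatch import fnmatchcase


def matches(pattern, path):
    """Match a slash-separated path against a pattern, one segment at a time.

    Assumption: a "*" never spans a "/" boundary, so the pattern and the
    path must have the same number of segments.
    """
    pattern_parts = pattern.split("/")
    path_parts = path.split("/")
    if len(pattern_parts) != len(path_parts):
        return False
    return all(
        fnmatchcase(part, pat) for pat, part in zip(pattern_parts, path_parts)
    )


# Placeholder names; real test paths are longer and more specific.
GROUP_PATTERN = "bot/suite/metric/*"
STORY_PATTERN = "bot/suite/metric/*/*"

group_path = "bot/suite/metric/story_group"
story_path = "bot/suite/metric/story_group/story"

print(matches(GROUP_PATTERN, group_path), matches(GROUP_PATTERN, story_path))
print(matches(STORY_PATTERN, group_path), matches(STORY_PATTERN, story_path))
```

Under this assumption, a config ending in `/*` would select only the story-group averages, and one ending in `/*/*` would select the individual stories.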

(Also, we need a better UI story for the dashboard, as navigating my own metrics is becoming impossible, but that is another story.)

[1] https://chromeperf.appspot.com/report?sid=24541df628a91a62504b0954e454a7483e6058a8c1c78713c3dfd473a7fd017a&start_rev=405220&end_rev=424328
Owner: eyaich@chromium.org
Annie is out, so reassigning this to the Automation sheriff.
Owner: sullivan@chromium.org
Sorry for the long delay here. Should this be closed in favor of https://github.com/catapult-project/catapult/issues/2722 ?
Status: Fixed (was: Assigned)
I would say yes. Note that skia in particular is now included in the list of alerted metrics.

Please send any further comments to the catapult bug, reopening if needed.
