remove summary metrics from tough_video_cases, only use per page metrics |
|||||||||||||||||
Issue descriptionSee the link to graphs below.
,
May 12 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8979768684108547456
,
May 13 2017
=== Auto-CCing suspected CL author crouleau@google.com === Hi crouleau@google.com, the bisect results pointed to your CL, please take a look at the results. === BISECT JOB RESULTS === Perf regression found with culprit Suspected Commit Author : Caleb Rouleau Commit : c52544194f2bf97e72cf9215f2c3d40928929036 Date : Tue May 09 17:51:57 2017 Subject: Merge tough_video_cases_extra into tough_video_cases. Bisect Details Configuration: mac_10_12_perf_bisect Benchmark : media.tough_video_cases_tbmv2 Metric : cpu_time_percentage_avg/cpu_time_percentage_avg Change : 32.52% | 0.229632941481 -> 0.304301320532 Revision Result N chromium@470312 0.229633 +- 0.00282995 6 good chromium@470356 0.228509 +- 0.00334922 6 good chromium@470367 0.229666 +- 0.00595407 6 good chromium@470373 0.225816 +- 0.0101137 6 good chromium@470374 0.229312 +- 0.00667581 6 good chromium@470375 0.304201 +- 0.00670207 6 bad <-- chromium@470376 0.305796 +- 0.0075482 6 bad chromium@470378 0.302836 +- 0.00570879 6 bad chromium@470400 0.304301 +- 0.00503317 6 bad To Run This Test src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests media.tough_video_cases_tbmv2 Debug Info https://chromeperf.appspot.com/buildbucket_job_status/8979768684108547456 Is this bisect wrong? https://chromeperf.appspot.com/bad_bisect?try_job_id=6379667335938048 | O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq | X | for more information addressing perf regression bugs. For feedback, | / \ | file a bug with component Speed>Bisection. Thank you!
,
Jun 5 2017
+John: it seems like having this summary cpu_time_percentage_avg means that every time we add new pages in the benchmark we will get alerts like this. I wonder if it would be better to only have metrics per page but no summary. This is expected behavior, so the status is WontFix.
,
Jun 5 2017
We generally recommend per-page metrics: you don't get these kind of alerts, and the bisect will filter for just the page that regressed, so it runs faster and isn't dampened by pages with different performance characteristics.
,
Jun 5 2017
remove summary metrics from tough_video_cases_tbmv2, only use per page metrics John, can you handle fixing this?
,
Jun 5 2017
,
Jun 5 2017
Err, sorry: seems like his CL doesn't address summary metrics in general. I think the best way to avoid this kind of summary alert should be done at dashboard level. +Ben who is working on updating how summarized data is computed.
,
Jun 5 2017
Ah, should I simply make the regex from https://chromeperf.appspot.com/edit_sheriffs for my rotation not match the summary metrics?
,
Jun 5 2017
Caleb, that's exactly right; modifying the regex list for alerting is the right place to make sure that you don't get alerts about summary metrics. For example (and I think the rotation you're talking about is Chromium Perf AV Sheriff), there are currently: ChromiumPerf/*/media.tough_video_cases_tbmv2/* ChromiumPerf/*/media.tough_video_cases_tbmv2/*/* By deleting the first line, you will not receive alerts about summary metrics any more. My CL is orthogonal as Ned pointed out; I'm removing myself from this bug. Feel free to re-add if there are further questions.
,
Jun 5 2017
Thanks Ethan! I'll take this.
,
Jun 5 2017
Right now the monitoring is set to: ChromiumPerf/*/media.tough_video_cases_tbmv2/* ChromiumPerf/*/media.tough_video_cases_tbmv2/*/* That includes hundreds of metrics. It'd be better to monitor some specific high-level metrics. Charlie, can you help with guidance on power/cpu metrics? For memory metrics, we currently monitor these on the system health benchmark: memory:chrome:all_processes:reported_by_chrome:cc:effective_size_avg memory:chrome:all_processes:reported_by_chrome:gpu:effective_size_avg memory:chrome:all_processes:reported_by_chrome:malloc:effective_size_avg memory:chrome:all_processes:reported_by_chrome:skia:effective_size_avg memory:chrome:all_processes:reported_by_chrome:v8:effective_size_avg memory:chrome:all_processes:reported_by_os:system_memory:java_heap:proportional_resident_size_avg memory:chrome:all_processes:reported_by_os:system_memory:native_heap:proportional_resident_size_avg memory:chrome:all_processes:reported_by_os:system_memory:proportional_resident_size_avg
,
Jun 7 2017
As for power measurements, we've been monitoring: ChromiumPerf/*/<benchmark_name>/*energy_sum/* ChromiumPerf/*/<benchmark_name>/*power_avg/* ChromiumPerf/*/<benchmark_name>/cpu_time_percentage_avg/* This is the current minimal set of metrics that we're taking responsibility for right now and saying "if you get an alert on these metrics, and it's not a real alert, that's a problem for us and please let us know".
,
Jun 7 2017
Okay, I am going to change it from ChromiumPerf/*/media.tough_video_cases_tbmv2/* ChromiumPerf/*/media.tough_video_cases_tbmv2/*/* to ChromiumPerf/*/media.tough_video_cases_tbmv2/*energy_sum/* ChromiumPerf/*/media.tough_video_cases_tbmv2/*power_avg/* ChromiumPerf/*/media.tough_video_cases_tbmv2/cpu_time_percentage_avg/* ChromiumPerf/*/media.tough_video_cases_tbmv2/memory:chrome:all_processes:reported_by_chrome:cc:effective_size_avg/* ChromiumPerf/*/media.tough_video_cases_tbmv2/memory:chrome:all_processes:reported_by_chrome:gpu:effective_size_avg/* ChromiumPerf/*/media.tough_video_cases_tbmv2/memory:chrome:all_processes:reported_by_chrome:malloc:effective_size_avg/* ChromiumPerf/*/media.tough_video_cases_tbmv2/memory:chrome:all_processes:reported_by_chrome:skia:effective_size_avg/* ChromiumPerf/*/media.tough_video_cases_tbmv2/memory:chrome:all_processes:reported_by_chrome:v8:effective_size_avg/* ChromiumPerf/*/media.tough_video_cases_tbmv2/memory:chrome:all_processes:reported_by_os:system_memory:java_heap:proportional_resident_size_avg/* ChromiumPerf/*/media.tough_video_cases_tbmv2/memory:chrome:all_processes:reported_by_os:system_memory:native_heap:proportional_resident_size_avg/* ChromiumPerf/*/media.tough_video_cases_tbmv2/memory:chrome:all_processes:reported_by_os:system_memory:proportional_resident_size_avg/*
,
Jun 7 2017
Done. I also removed ChromiumPerf/*/media.tough_video_cases/*. I'm working on trimming the metrics that we alert on here: https://docs.google.com/document/d/1TvUWPl6diK3DTJdSWMOcdz-6qnKC6KEHAZqoBBD8lIc/edit#heading=h.pbl9canvrafx
,
Jul 20 2017
Issue 745924 has been merged into this issue.
,
Jul 20 2017
See issue 745924 . It looks like alerts for summary metrics are still happening. The line in our alerting config that is triggering this is: ChromiumPerf/*/media.tough_video_cases_tbmv2/memory:chrome:all_processes:reported_by_chrome:cc:effective_size_avg/* In comment #10, eakuefner@ suggested that the summary metrics would not match one I added the last "/*", but this seems to not be true anymore.
,
Jul 20 2017
simonhatch, any idea what could be going on here? The alerting page linked from the bug in #17 is https://chromeperf.appspot.com/group_report?sid=9686f126f87fe418f49ce0dec850d9fefacb24b3320a6a915067835c1805b8b6. It has an alert for ChromiumPerf/chromium-rel-win7-gpu-nvidia/media.tough_video_cases_tbmv2/memory:chrome:all_processes:reported_by_chrome:cc:effective_size_avg. When I type the following into the interactive dashboard console: from dashboard.common import utils t = utils.TestKey('ChromiumPerf/chromium-rel-win7-gpu-nvidia/media.tough_video_cases_tbmv2/memory:chrome:all_processes:reported_by_chrome:cc:effective_size_avg').get() print t.sheriff It prints "None" as expected. Looking at the dates on the bug, Caleb changed the alerting pattern on June 7, and the timestamp on the alert is "2017-07-14T06:57:04.119750". So it seems like this alert shouldn't have fired. Any ideas?
,
Jul 20 2017
Annie, what about
from dashboard.common import utils
t = utils.TestKey('ChromiumPerf/chromium-rel-win7-gpu-nvidia/media.tough_video_cases_tbmv2/memory:chrome:all_processes:reported_by_chrome:cc:effective_size_avg/memory:chrome:all_processes:reported_by_chrome:cc:effective_size_avg').get()
print t.sheriff
The reason I ask is that issue 745924 lists the metric as
"memory:chrome:all_processes:reported_by_chrome:cc:effective_size_avg/memory:chrome:all_processes:reported_by_chrome:cc:effective_size_avg"
,
Jul 20 2017
I think I'm missing something here; where do you see the metric name repeated with a slash like that? I attached a screenshot of what I'm seeing, which unless I'm reading wrong is what I typed in #18. When I copy/paste #19, t is None.
,
Jul 20 2017
Comment 3 of issue 745924 says: Bisect Details Configuration: winx64nvidia_perf_bisect Benchmark : media.tough_video_cases_tbmv2 Metric : memory:chrome:all_processes:reported_by_chrome:cc:effective_size_avg/memory:chrome:all_processes:reported_by_chrome:cc:effective_size_avg
,
Jul 27 2017
Issue 745938 has been merged into this issue.
,
Jul 27 2017
It's unclear what I can do to stop these alerts from coming in. Any suggestions, Simon?
,
Jul 27 2017
Sorry, didn't notice my name on here, let me look into this.
,
Jul 27 2017
re: #c19 That's just some artifacts of how bisect expects parameters. The metric memory:chrome:all_processes:reported_by_chrome:cc:effective_size_avg/memory:chrome:all_processes:reported_by_chrome:cc:effective_size_avg doesn't exist in the datastore.
,
Sep 25 2017
,
Sep 25 2017
,
Sep 25 2017
Simon filed https://github.com/catapult-project/catapult/issues/3903 to track work on this.
,
Nov 6 2017
Issue 767952 has been merged into this issue.
,
Oct 4
Think we found the root cause in the github issue a while back by sweeping the datastore and fixing the TestMetadata's.
,
Jan 16
,
Jan 16
|
|||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||
Comment 1 by jrumm...@chromium.org
, May 12 2017