Perf dashboard rejecting data from memory infra benchmarks
Issue description

Looks like after Petr's CL (https://codereview.chromium.org/1907343002) landed, the dashboard started rejecting data from all of:
- memory.memory_health_quick
- memory.top_10_mobile_tbmv2
- memory.memory_health_plan (internal)

The error is, for example:

Sending result 2 of 2 to dashboard.
{"is_ref": false, "test_suite_name": "memory.memory_health_quick", "master": "ChromiumPerf", ...
Discarding JSON, error: HTTPError: 400. Reponse: Invalid value type in chart object: u'trace' request_id:571ed38b00ff0d75b8c56b6c4b0001737e6368726f6d65706572660001636c65616e2d73756c6c6976616e2d353032316239633200010117
JSON: {"is_ref": false, "test_suite_name": "memory.memory_health_quick", "master": "ChromiumPerf", ...
@@@STEP_LINK@Results Dashboard@https://chromeperf.appspot.com/report?masters=ChromiumPerf&bots=android-nexus5&tests=memory.memory_health_quick&rev=389553@@@
Error uploading to dashboard.

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/2800/steps/memory.memory_health_quick/logs/stdio

Example JSON that failed to upload:
https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/2800/steps/memory.memory_health_quick/logs/json.output

Example JSON that *did* upload (from the previous run):
https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/2799/steps/memory.memory_health_quick/logs/json.output

Will try to see if there is a quick fix, otherwise will try reverting :-(

P0: Although data is arriving just fine at the health dashboard, without data on the perf dashboard we won't have alerts in case of regressions.
,
Apr 26 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/1c572b2ccff4c64f3e9aecd526729ffb4e89a00e

commit 1c572b2ccff4c64f3e9aecd526729ffb4e89a00e
Author: perezju <perezju@chromium.org>
Date: Tue Apr 26 08:49:14 2016

Revert of Set up a parallel memory.memory_health_quick_tbmv2 benchmark (patchset #4 id:80001 of https://codereview.chromium.org/1907343002/ )

Reason for revert:
Broke data upload to perf dashboard for these benchmarks.

BUG= 606698

Original issue's description:
> Set up a parallel memory.memory_health_quick_tbmv2 benchmark
>
> This patch adds a temporary parallel benchmark to compare the new TBMv2
> memory metric (third_party/catapult/tracing/tracing/metrics/
> system_health/memory_metric.html) with the existing TBMv1 one
> (third_party/catapult/telemetry/telemetry/web_perf/metrics/
> memory_timeline.py).
>
> Once all issues associated with the TBMv2 metric are resolved, all
> memory benchmarks will switch to use it instead of the TBMv1 metric and
> the temporary parallel benchmark will be removed.
>
> BUG= 581716 , 606361
> CQ_EXTRA_TRYBOTS=tryserver.chromium.perf:android_s5_perf_cq;tryserver.chromium.perf:winx64_10_perf_cq;tryserver.chromium.perf:mac_retina_perf_cq;tryserver.chromium.perf:linux_perf_cq
>
> Committed: https://crrev.com/92efb5bc7f3830436cc047a39e79c5374b5a4d7a
> Cr-Commit-Position: refs/heads/master@{#389519}

TBR=primiano@chromium.org,nednguyen@google.com,petrcermak@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG= 581716 , 606361

Review URL: https://codereview.chromium.org/1915163003
Cr-Commit-Position: refs/heads/master@{#389726}

[modify] https://crrev.com/1c572b2ccff4c64f3e9aecd526729ffb4e89a00e/tools/perf/benchmarks/memory_infra.py
[modify] https://crrev.com/1c572b2ccff4c64f3e9aecd526729ffb4e89a00e/tools/perf/page_sets/memory_health_story.py
,
Apr 26 2016
Adding telemetry folks. I guess telemetry should not be splitting traces by grouping keys? Lowering priority after revert.
,
Apr 26 2016
,
Apr 26 2016
,
Apr 26 2016
+Ethan: can you take a look?
,
Apr 26 2016
Looks like the fix would be to special-case trace values and not add the grouping key for them: https://github.com/catapult-project/catapult/blob/master/telemetry/telemetry/internal/results/page_test_results.py#L173#L193

Petr, Juan: Ethan is busy doing polymer hackathon 1.0 this week in Ann Arbor. Would either of you be willing to take over and make this quick fix?
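As a rough illustration of the special-casing suggested above (not the actual Telemetry code): the helper name `chart_name`, its parameters, and the value-type strings below are hypothetical. Only the `@@` separator and the `trace` chart name come from the JSON output quoted in this bug.

```python
# Separator seen in chart names such as "background@@trace" in this bug.
GROUPING_KEY_SEPARATOR = "@@"


def chart_name(value_name, value_type, grouping_key=None):
    """Return the chart name a value would be reported under.

    Hypothetical helper: trace values keep their plain name so that all
    traces stay under a single "trace" chart; every other value is
    prefixed with its grouping key, if it has one.
    """
    if value_type == "trace" or not grouping_key:
        return value_name
    return grouping_key + GROUPING_KEY_SEPARATOR + value_name


# Trace values ignore the grouping key; other values are still split.
print(chart_name("trace", "trace", "background"))
print(chart_name("process_count_total", "scalar", "foreground"))
```

With a rule like this, metric values would still be reported per grouping key, while the dashboard would only ever see one `trace` chart, as it did before the original CL.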
,
Apr 27 2016
I will make a CL.
,
Apr 27 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/aab8e6c7849940255e53a59ce37335abd355bbf7

commit aab8e6c7849940255e53a59ce37335abd355bbf7
Author: catapult-deps-roller <catapult-deps-roller@chromium.org>
Date: Wed Apr 27 16:59:41 2016

Roll src/third_party/catapult/ b26a0cbf1..c3f191e13 (1 commit).

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/b26a0cbf1935..c3f191e1307c

$ git log b26a0cbf1..c3f191e13 --date=short --no-merges --format='%ad %ae %s'

BUG= 606698
TBR=catapult-sheriff@chromium.org

Review-Url: https://codereview.chromium.org/1922303003
Cr-Commit-Position: refs/heads/master@{#390109}

[modify] https://crrev.com/aab8e6c7849940255e53a59ce37335abd355bbf7/DEPS
,
Apr 27 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/a2dfc3c66ad792aea2f53009c26520deb4cbfed8

commit a2dfc3c66ad792aea2f53009c26520deb4cbfed8
Author: nednguyen <nednguyen@google.com>
Date: Wed Apr 27 18:10:58 2016

Reland of Set up a parallel memory.memory_health_quick_tbmv2 benchmark (patchset #1 id:1 of https://codereview.chromium.org/1915163003/ )

Reason for revert:
Underlied problem should be already fixed.

Original issue's description:
> Revert of Set up a parallel memory.memory_health_quick_tbmv2 benchmark (patchset #4 id:80001 of https://codereview.chromium.org/1907343002/ )
>
> Reason for revert:
> Broke data upload to perf dashboard for these benchmarks.
>
> BUG= 606698
>
> Original issue's description:
> > Set up a parallel memory.memory_health_quick_tbmv2 benchmark
> >
> > This patch adds a temporary parallel benchmark to compare the new TBMv2
> > memory metric (third_party/catapult/tracing/tracing/metrics/
> > system_health/memory_metric.html) with the existing TBMv1 one
> > (third_party/catapult/telemetry/telemetry/web_perf/metrics/
> > memory_timeline.py).
> >
> > Once all issues associated with the TBMv2 metric are resolved, all
> > memory benchmarks will switch to use it instead of the TBMv1 metric and
> > the temporary parallel benchmark will be removed.
> >
> > BUG= 581716 , 606361
> > CQ_EXTRA_TRYBOTS=tryserver.chromium.perf:android_s5_perf_cq;tryserver.chromium.perf:winx64_10_perf_cq;tryserver.chromium.perf:mac_retina_perf_cq;tryserver.chromium.perf:linux_perf_cq
> >
> > Committed: https://crrev.com/92efb5bc7f3830436cc047a39e79c5374b5a4d7a
> > Cr-Commit-Position: refs/heads/master@{#389519}
>
> TBR=primiano@chromium.org,nednguyen@google.com,petrcermak@chromium.org
> # Skipping CQ checks because original CL landed less than 1 days ago.
> NOPRESUBMIT=true
> NOTREECHECKS=true
> NOTRY=true
> BUG= 581716 , 606361
>
> Committed: https://crrev.com/1c572b2ccff4c64f3e9aecd526729ffb4e89a00e
> Cr-Commit-Position: refs/heads/master@{#389726}

TBR=primiano@chromium.org,petrcermak@chromium.org,perezju@chromium.org
# Not skipping CQ checks because original CL landed more than 1 days ago.
BUG= 606698

Review-Url: https://codereview.chromium.org/1920363003
Cr-Commit-Position: refs/heads/master@{#390126}

[modify] https://crrev.com/a2dfc3c66ad792aea2f53009c26520deb4cbfed8/tools/perf/benchmarks/memory_infra.py
[modify] https://crrev.com/a2dfc3c66ad792aea2f53009c26520deb4cbfed8/tools/perf/page_sets/memory_health_story.py
,
Apr 28 2016
Fix landed and tests are working fine this time :)
,
Apr 28 2016
Here are the data reported by the parallel benchmark on Nexus 5: https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%282%29/builds/2652/steps/memory.top_10_mobile_tbmv2/logs/json.output

However, it seems to me that, unlike the TBMv1 memory benchmark data (e.g. https://chromeperf.appspot.com/report?sid=9396c22951b4d0c26ae0ef1b0c84c3f8b6185b60c87625fdcba5071e1a354d3c is up to date), the TBMv2 memory benchmark data is currently NOT exported to the dashboard: When I select "memory.top_10_mobile_tbmv2" as the test suite and "android-nexus5" as the bot, the third dropdown on the dashboard keeps loading for a while and then disappears (suggesting that the TBMv2 memory benchmark results were empty).

Furthermore, unlike the TBMv1 benchmark, which shows metric results on the build page (https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/2828):

    +----------------------------------------------------------------------+
53. | memory.memory_health_quick memory.memory_health_quick                |
    | memory.memory_health_quick                                           |
    |                                                                      |
    | background@@memory_allocated_objects_partition_alloc_renderer: 1.45M |
    | background@@memory_allocated_objects_partition_alloc_total: 1.45M    |
    | background@@memory_allocated_objects_v8_renderer: 1.75M              |
    | background@@memory_allocated_objects_v8_total: 1.75M                 |
    | [...]                                                                |
    | foreground@@process_count_renderer: 1.0                              |
    | foreground@@process_count_total: 3.0                                 |
    |                                                                      |
    | Device Affinity: 3                                                   |
    +----------------------------------------------------------------------+
    * stdio
    * json.output
    * Results Dashboard

the TBMv2 benchmark does NOT show any metric results on the build page (https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%282%29/builds/2652):

    +----------------------------------------------------------------------+
42. | memory.top_10_mobile_tbmv2 memory.top_10_mobile_tbmv2                |
    | memory.top_10_mobile_tbmv2                                           |
    |                                                                      |
    | Device Affinity: 1                                                   |
    +----------------------------------------------------------------------+
    * stdio
    * json.output
    * Results Dashboard

nednguyen, eakuefner: Is this a problem with the TBMv2 benchmark (e.g. using ":" in value names) or the dashboard?
,
Apr 28 2016
Hmm.. that's weird, I was able to select some graphs: https://chromeperf.appspot.com/report?sid=5762114bab0c4783f176b0997b38cee796b55a9e6da9261bbcb6a269ed2c3978

But I did notice:
- Sometimes, the dropdown to select a metric does not load (as happened to Petr).
- When it loads, auto-complete seems somewhat broken; for example, typing "overall" or "pss" shows nothing, and I had to type almost the entire name "memory:chrome:all:vmstats:overall:pss_avg".
- Sometimes graphs end up with the error "Failed to fetch graph data", even if the chart was already shown.

I'm guessing this could be more of an issue with us sending lots and lots of metrics? (For every metric we now end up sending _std, _avg, _sum, etc.)

Also not sure whether to re-open this, or file one or more new issues.
,
Apr 28 2016
Annie: can you help take a look at the dashboard issue?
,
Apr 28 2016
I filed https://github.com/catapult-project/catapult/issues/2286 for the dashboard problem.
Comment 1 by perezju@chromium.org, Apr 26 2016

Will have to revert, I don't think there is an easy fix.

The problem is that, before, traces in the output json used to look like:

"trace": {
  "after_http_amazon_com": {
    "cloud_url": "https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_15-2016-04-25_12-22-06-3934.html",
    ...

But now they are being split as:

"background@@trace": {
  "after_http_amazon_com": {
    "cloud_url": "https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_15-2016-04-25_18-30-40-86938.html",
    ...
"foreground@@trace": {
  "after_http_amazon_com": {
    "cloud_url": "https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_15-2016-04-25_18-30-40-86938.html",
    ...
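A quick way to check a json.output file for the split described above might look like the sketch below. It assumes the chart names sit as keys of a single dictionary, as in the excerpts; the function name `find_split_trace_charts` and the placeholder entries are hypothetical, and the real file carries more fields than shown here.

```python
def find_split_trace_charts(charts):
    """Return chart names of the form "<grouping key>@@trace", sorted.

    `charts` is assumed to map chart names to per-page entries, as in the
    excerpts quoted in this comment. A plain "trace" chart is the expected
    shape and is not reported.
    """
    return sorted(name for name in charts if name.endswith("@@trace"))


# Placeholder entries only; the real per-page data is elided in this bug.
before = {"trace": {"after_http_amazon_com": {}}}
after = {
    "background@@trace": {"after_http_amazon_com": {}},
    "foreground@@trace": {"after_http_amazon_com": {}},
}

print(find_split_trace_charts(before))
print(find_split_trace_charts(after))
```

An empty result corresponds to the older output that uploaded fine; any names returned are the split trace charts that the dashboard rejected.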