
Issue 606698


Issue metadata

Status: Verified
Owner:
Closed: Apr 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 1
Type: Bug

Blocking:
issue 581716
issue 606361




Perf dashboard rejecting data from memory infra benchmarks

Project Member Reported by perezju@chromium.org, Apr 26 2016

Issue description

It looks like after Petr's CL (https://codereview.chromium.org/1907343002) landed, the dashboard started rejecting data from all of:

- memory.memory_health_quick
- memory.top_10_mobile_tbmv2
- memory.memory_health_plan (internal)

The error is, for example:

Sending result 2 of 2 to dashboard.
{"is_ref": false, "test_suite_name": "memory.memory_health_quick", "master": "ChromiumPerf", ...
Discarding JSON, error:
HTTPError: 400. Reponse: Invalid value type in chart object: u'trace'
request_id:571ed38b00ff0d75b8c56b6c4b0001737e6368726f6d65706572660001636c65616e2d73756c6c6976616e2d353032316239633200010117

JSON: {"is_ref": false, "test_suite_name": "memory.memory_health_quick", "master": "ChromiumPerf", ...

@@@STEP_LINK@Results Dashboard@https://chromeperf.appspot.com/report?masters=ChromiumPerf&bots=android-nexus5&tests=memory.memory_health_quick&rev=389553@@@
Error uploading to dashboard.
https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/2800/steps/memory.memory_health_quick/logs/stdio

Example JSON that failed to upload:
https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/2800/steps/memory.memory_health_quick/logs/json.output

Example JSON that *did* upload (from the previous run):
https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/2799/steps/memory.memory_health_quick/logs/json.output

I will see whether there is a quick fix; otherwise I will try reverting :-(

P0: Although data is arriving just fine at the health dashboard, without data on the perf dashboard we won't have alerts in case of regressions.
 
I will have to revert; I don't think there is an easy fix.

The problem is that traces in the output JSON used to look like:

      "trace": {
        "after_http_amazon_com": {
          "cloud_url": "https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_15-2016-04-25_12-22-06-3934.html", 
          ...

But now they are being split as:

      "background@@trace": {
        "after_http_amazon_com": {
          "cloud_url": "https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_15-2016-04-25_18-30-40-86938.html", 
          ...

      "foreground@@trace": {
        "after_http_amazon_com": {
          "cloud_url": "https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_15-2016-04-25_18-30-40-86938.html", 
          ...
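
For illustration, here is a minimal sketch of why the renamed charts might be rejected. It assumes the dashboard special-cases only a chart named exactly "trace" and expects every other chart to hold scalar-like values. The function names, the accepted-type set and the "type" field on the trace value are all assumptions made for this example, not the actual telemetry or dashboard code.

```python
# Hypothetical sketch: these names are illustrative, not real telemetry or dashboard APIs.
GROUPING_SEPARATOR = "@@"
# Assumption: value types the dashboard accepts for ordinary charts.
ACCEPTED_TYPES = {"scalar", "list_of_scalar_values", "histogram"}


def chart_name(value_name, grouping_key=None):
    """Prefix a value name with its grouping key, as in "background@@trace"."""
    if grouping_key:
        return grouping_key + GROUPING_SEPARATOR + value_name
    return value_name


def validate_charts(charts):
    """Return one error string per value the assumed check would reject."""
    errors = []
    for name, stories in charts.items():
        if name == "trace":
            # Assumption: only a chart named exactly "trace" is special-cased.
            continue
        for value in stories.values():
            if value.get("type") not in ACCEPTED_TYPES:
                errors.append(
                    "Invalid value type in chart object: %r" % value.get("type"))
    return errors


# Assumed shape of a trace entry; the real entry also carries a cloud_url.
trace_value = {"after_http_amazon_com": {"type": "trace"}}
old_charts = {chart_name("trace"): trace_value}
new_charts = {
    chart_name("trace", "background"): trace_value,
    chart_name("trace", "foreground"): trace_value,
}
old_errors = validate_charts(old_charts)
new_errors = validate_charts(new_charts)
```

Under these assumptions the old output passes, while each of the two prefixed trace charts produces an error of the same form as the one in the log above.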

Project Member

Comment 2 by bugdroid1@chromium.org, Apr 26 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/1c572b2ccff4c64f3e9aecd526729ffb4e89a00e

commit 1c572b2ccff4c64f3e9aecd526729ffb4e89a00e
Author: perezju <perezju@chromium.org>
Date: Tue Apr 26 08:49:14 2016

Revert of Set up a parallel memory.memory_health_quick_tbmv2 benchmark (patchset #4 id:80001 of https://codereview.chromium.org/1907343002/ )

Reason for revert:
Broke data upload to perf dashboard for these benchmarks.

BUG= 606698 

Original issue's description:
> Set up a parallel memory.memory_health_quick_tbmv2 benchmark
>
> This patch adds a temporary parallel benchmark to compare the new TBMv2
> memory metric (third_party/catapult/tracing/tracing/metrics/
> system_health/memory_metric.html) with the existing TBMv1 one
> (third_party/catapult/telemetry/telemetry/web_perf/metrics/
> memory_timeline.py).
>
> Once all issues associated with the TBMv2 metric are resolved, all
> memory benchmarks will switch to use it instead of the TBMv1 metric and
> the temporary parallel benchmark will be removed.
>
> BUG= 581716 , 606361 
> CQ_EXTRA_TRYBOTS=tryserver.chromium.perf:android_s5_perf_cq;tryserver.chromium.perf:winx64_10_perf_cq;tryserver.chromium.perf:mac_retina_perf_cq;tryserver.chromium.perf:linux_perf_cq
>
> Committed: https://crrev.com/92efb5bc7f3830436cc047a39e79c5374b5a4d7a
> Cr-Commit-Position: refs/heads/master@{#389519}

TBR=primiano@chromium.org,nednguyen@google.com,petrcermak@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG= 581716 , 606361 

Review URL: https://codereview.chromium.org/1915163003

Cr-Commit-Position: refs/heads/master@{#389726}

[modify] https://crrev.com/1c572b2ccff4c64f3e9aecd526729ffb4e89a00e/tools/perf/benchmarks/memory_infra.py
[modify] https://crrev.com/1c572b2ccff4c64f3e9aecd526729ffb4e89a00e/tools/perf/page_sets/memory_health_story.py

Cc: eakuefner@chromium.org
Labels: -Pri-0 Pri-2
Owner: nednguyen@chromium.org
Adding telemetry folks. I guess telemetry should not be splitting traces by grouping keys?

Lowering priority after revert.
Blocking: 606361 581716
Cc: petrcermak@chromium.org
Labels: -Pri-2 Pri-1
Cc: nedngu...@google.com
Owner: eakuefner@chromium.org
Status: Assigned (was: Untriaged)
+Ethan: can you take a look?
Looks like the fix would be to special-case trace values and not add a grouping key for them: https://github.com/catapult-project/catapult/blob/master/telemetry/telemetry/internal/results/page_test_results.py#L173#L193
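
A rough sketch of what such a special case could look like, assuming chart names are built by prefixing the grouping key with "@@". The function name and its parameters are hypothetical and are not taken from page_test_results.py.

```python
# Hypothetical helper; not the actual code in page_test_results.py.
def chart_name_for_value(value_name, value_type, grouping_key=None):
    """Build the chart name, leaving trace values ungrouped."""
    if value_type == "trace" or not grouping_key:
        # Trace values keep their plain name so the dashboard still sees "trace".
        return value_name
    return grouping_key + "@@" + value_name


trace_chart = chart_name_for_value("trace", "trace", "background")
scalar_chart = chart_name_for_value("process_count_total", "scalar", "foreground")
```

With this, trace values from both the background and foreground groups land under the single "trace" chart again, while scalar values keep their grouping prefix.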

Petr, Juan: Ethan is busy with the Polymer hackathon 1.0 this week in Ann Arbor. Would either of you be willing to take over and make this quick fix?
Owner: nedngu...@google.com
I will make a CL.
Project Member

Comment 10 by bugdroid1@chromium.org, Apr 27 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/aab8e6c7849940255e53a59ce37335abd355bbf7

commit aab8e6c7849940255e53a59ce37335abd355bbf7
Author: catapult-deps-roller <catapult-deps-roller@chromium.org>
Date: Wed Apr 27 16:59:41 2016

Roll src/third_party/catapult/ b26a0cbf1..c3f191e13 (1 commit).

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/b26a0cbf1935..c3f191e1307c

$ git log b26a0cbf1..c3f191e13 --date=short --no-merges --format='%ad %ae %s'

BUG= 606698 

TBR=catapult-sheriff@chromium.org

Review-Url: https://codereview.chromium.org/1922303003
Cr-Commit-Position: refs/heads/master@{#390109}

[modify] https://crrev.com/aab8e6c7849940255e53a59ce37335abd355bbf7/DEPS

Project Member

Comment 11 by bugdroid1@chromium.org, Apr 27 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a2dfc3c66ad792aea2f53009c26520deb4cbfed8

commit a2dfc3c66ad792aea2f53009c26520deb4cbfed8
Author: nednguyen <nednguyen@google.com>
Date: Wed Apr 27 18:10:58 2016

Reland of Set up a parallel memory.memory_health_quick_tbmv2 benchmark (patchset #1 id:1 of https://codereview.chromium.org/1915163003/ )

Reason for revert:
Underlied problem should be already fixed.

Original issue's description:
> Revert of Set up a parallel memory.memory_health_quick_tbmv2 benchmark (patchset #4 id:80001 of https://codereview.chromium.org/1907343002/ )
>
> Reason for revert:
> Broke data upload to perf dashboard for these benchmarks.
>
> BUG= 606698 
>
> Original issue's description:
> > Set up a parallel memory.memory_health_quick_tbmv2 benchmark
> >
> > This patch adds a temporary parallel benchmark to compare the new TBMv2
> > memory metric (third_party/catapult/tracing/tracing/metrics/
> > system_health/memory_metric.html) with the existing TBMv1 one
> > (third_party/catapult/telemetry/telemetry/web_perf/metrics/
> > memory_timeline.py).
> >
> > Once all issues associated with the TBMv2 metric are resolved, all
> > memory benchmarks will switch to use it instead of the TBMv1 metric and
> > the temporary parallel benchmark will be removed.
> >
> > BUG= 581716 , 606361 
> > CQ_EXTRA_TRYBOTS=tryserver.chromium.perf:android_s5_perf_cq;tryserver.chromium.perf:winx64_10_perf_cq;tryserver.chromium.perf:mac_retina_perf_cq;tryserver.chromium.perf:linux_perf_cq
> >
> > Committed: https://crrev.com/92efb5bc7f3830436cc047a39e79c5374b5a4d7a
> > Cr-Commit-Position: refs/heads/master@{#389519}
>
> TBR=primiano@chromium.org,nednguyen@google.com,petrcermak@chromium.org
> # Skipping CQ checks because original CL landed less than 1 days ago.
> NOPRESUBMIT=true
> NOTREECHECKS=true
> NOTRY=true
> BUG= 581716 , 606361 
>
> Committed: https://crrev.com/1c572b2ccff4c64f3e9aecd526729ffb4e89a00e
> Cr-Commit-Position: refs/heads/master@{#389726}

TBR=primiano@chromium.org,petrcermak@chromium.org,perezju@chromium.org
# Not skipping CQ checks because original CL landed more than 1 days ago.
BUG= 606698 

Review-Url: https://codereview.chromium.org/1920363003
Cr-Commit-Position: refs/heads/master@{#390126}

[modify] https://crrev.com/a2dfc3c66ad792aea2f53009c26520deb4cbfed8/tools/perf/benchmarks/memory_infra.py
[modify] https://crrev.com/a2dfc3c66ad792aea2f53009c26520deb4cbfed8/tools/perf/page_sets/memory_health_story.py

Status: Verified (was: Assigned)
Fix landed and tests are working fine this time :)
Here are the data reported by the parallel benchmark on Nexus 5: https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%282%29/builds/2652/steps/memory.top_10_mobile_tbmv2/logs/json.output

However, it seems to me that, unlike the TBMv1 memory benchmark data (e.g. https://chromeperf.appspot.com/report?sid=9396c22951b4d0c26ae0ef1b0c84c3f8b6185b60c87625fdcba5071e1a354d3c is up to date), the TBMv2 memory benchmark data is currently NOT exported to the dashboard: When I select "memory.top_10_mobile_tbmv2" as the test suite and "android-nexus5" as the bot, the third dropdown on the dashboard keeps loading for a while and then disappears (suggesting that the TBMv2 memory benchmark results were empty).

Furthermore, unlike the TBMv1 benchmark, which shows metric results on the build page (https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/2828):

      +----------------------------------------------------------------------+
  53. | memory.memory_health_quick memory.memory_health_quick                |
      | memory.memory_health_quick                                           |
      |                                                                      |
      | background@@memory_allocated_objects_partition_alloc_renderer: 1.45M |
      | background@@memory_allocated_objects_partition_alloc_total: 1.45M    |
      | background@@memory_allocated_objects_v8_renderer: 1.75M              |
      | background@@memory_allocated_objects_v8_total: 1.75M                 |
      | [...]                                                                |
      | foreground@@process_count_renderer: 1.0                              |
      | foreground@@process_count_total: 3.0                                 |
      |                                                                      |
      | Device Affinity: 3                                                   |
      +----------------------------------------------------------------------+
        * stdio
        * json.output
        * Results Dashboard

the TBMv2 benchmark does NOT show any metric results on the build page (https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%282%29/builds/2652):

      +----------------------------------------------------------------------+
  42. | memory.top_10_mobile_tbmv2 memory.top_10_mobile_tbmv2                |
      | memory.top_10_mobile_tbmv2                                           |
      |                                                                      |
      | Device Affinity: 1                                                   |
      +----------------------------------------------------------------------+
        * stdio
        * json.output
        * Results Dashboard

nednguyen, eakuefner: Is this a problem with the TBMv2 benchmark (e.g. using ":" in value names) or the dashboard?
Hmm.. that's weird, I was able to select some graphs:
https://chromeperf.appspot.com/report?sid=5762114bab0c4783f176b0997b38cee796b55a9e6da9261bbcb6a269ed2c3978

But I did notice:
- Sometimes, the dropdown to select a metric does not load (as happened to Petr)
- When it loads, auto-complete seems somewhat broken: for example, typing "overall" or "pss" shows nothing, and I had to type almost the entire name "memory:chrome:all:vmstats:overall:pss_avg".
- Sometimes graphs end up with the error "Failed to fetch graph data", even if the chart was already shown.

I'm guessing the issue could be that we're now sending lots and lots of metrics (for every metric we end up sending _std, _avg, _sum, etc.)?
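
To give a feel for the growth, here is a small sketch that expands each base metric name into one name per statistic suffix. It uses only the three suffixes named above (the real list is longer), and the helper name is made up for this example.

```python
# Only the suffixes mentioned in this thread; the actual set is larger.
SUFFIXES = ["_std", "_avg", "_sum"]


def expand_metric_names(base_names, suffixes=SUFFIXES):
    """Return one reported name per (base metric, suffix) pair."""
    return [name + suffix for name in base_names for suffix in suffixes]


reported = expand_metric_names(["memory:chrome:all:vmstats:overall:pss"])
```

Each base metric turns into len(SUFFIXES) reported names, so the number of entries in the metric dropdown grows by that factor.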

Also not sure whether to re-open this, or file one or more new issues.
Annie: can you help take a look at the dashboard issue?
I filed https://github.com/catapult-project/catapult/issues/2286 for the dashboard problem.
