New issue
Advanced search Search tips

Issue 646925 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Sep 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug-Regression



Sign in to add a comment

TBMv2 benchmarks are failing on ref builds with clock sync error

Project Member Reported by oysteine@chromium.org, Sep 14 2016

Issue description

TraceImportError: No clock sync markers exist pairing clock domain "UNKNOWN_CHROME_LEGACY" with model clock domain "TELEMETRY".
    at Object.getModelTimeTransformerRaw_ (/tracing/model/clock_sync_manager.html:171:15)
    at Object.getModelTimeTransformer (/tracing/model/clock_sync_manager.html:151:19)
    at Object.toModelTimeFromUs_ (/tracing/extras/importer/trace_event_importer.html:3052:42)
    at Object.processCompleteEvent (/tracing/extras/importer/trace_event_importer.html:522:16)
    at Object.importEvents (/tracing/extras/importer/trace_event_importer.html:1000:28)
    at Object.runImportEventsOnOneImporter (/tracing/importer/import.html:189:26)
    at Object.timedNamedFunction (/tracing/base/timing.html:70:18)
    at Object.run (/tracing/base/task.html:71:21)
    at Function.Task.RunSynchronously (/tracing/base/task.html:152:25)
    at Object.importTraces (/tracing/importer/import.html:73:17)
[  FAILED  ] /tmp/tmplffUA5.html (977 ms)


 
Hi oysteine@,

Any chance you could link to the failure log? This looks like something that's almost certainly my fault, so I'd like to investigate.
Labels: -Pri-2 Pri-1
It looks like the Telemetry side of the clock sync marker is there:

{
            "args": {
                "issue_ts": 14600787442.741001,
                "sync_id": "55e804f8-b47f-4736-bdbd-4952c487a56f"
            },
            "category": "python",
            "name": "clock_sync",
            "ph": "c",
            "pid": 37128,
            "tid": 140735252758528,
            "ts": 14600787935.262001
        }

But the Chrome side is not. I just checked, and I'm also able to reproduce this locally.


I just verified that this does *not* happen on stable.
Okay, I think I figured it out:

There's been a bug that's been around for a while, https://github.com/catapult-project/catapult/issues/2547, that Telemetry traces couldn't clock sync with Chrome traces.

One outcome of this bug was that we discovered that Chrome was improperly parsing Telemetry's categories and, as a result, sometimes omitting metadata events (like clock sync markers). This is bad: those are always supposed to be included in the trace.

oysteine@ did some work that fixed this parsing. However, that work isn't available yet in the reference builds.

nedn@ recently fixed the other half of this bug (https://codereview.chromium.org/2334753002/), which made sure that we shifted timestamps that came in the trace event format. Because we only check that all of the clock domains are connected via clock sync markers when this shift happens, we all of a sudden started noticing, "Hey, Chrome and Telemetry aren't connected here..."

The fix here is going to be finding a way to make sure that Telemetry passes a category filter to Chrome that makes Chrome issue all metadata events, *even in the reference build*.
(Here are oysteine's changes that are making the stable build succeed: https://codereview.chromium.org/2040663002/)
Project Member

Comment 7 by bugdroid1@chromium.org, Sep 14 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/2c1a2b054e002dd463cdf88df02b3e20d550d0dd

commit 2c1a2b054e002dd463cdf88df02b3e20d550d0dd
Author: oysteine <oysteine@chromium.org>
Date: Wed Sep 14 19:08:50 2016

Disabled the v8.todomvc reference build due to missing clock sync markers

TBR=nednguyen@chromium.org
BUG= 646925 

Review-Url: https://codereview.chromium.org/2340033002
Cr-Commit-Position: refs/heads/master@{#418632}

[modify] https://crrev.com/2c1a2b054e002dd463cdf88df02b3e20d550d0dd/tools/perf/benchmarks/v8.py

So I tried just adding the "__metadata" category to the filter string, but that doesn't seem to have had an effect. If I remember correctly, the bug here is that some metadata events only get recorded when the default categories are enabled. I'm testing to verify this right now.
What the heck... even if I use the default overhead filter, (options.config.chrome_trace_config.SetDefaultOverheadFilter() in the test setup - basically using '*' as the filter string), the clock sync events aren't being recorded. I wonder what's happening?

At this point, the next step might be to sync back to the git revision associated with the reference build and add some logging statements to try and figure out what's going on.
Owner: charliea@chromium.org
Status: Assigned (was: Untriaged)
Summary: TBMv2 benchmarks are failing on ref builds with clock sync error (was: Re-enable v8.todomvc.reference once the ref builds roll over)
The real bug is for those ref build benchmark, "blink.console" wasn't enabled, hence the backcompat code path wasn't enabled. I will reland my fix with a hack to telemetry to enable blink.console.
Owner: nedngu...@google.com
I stopped the catapult roll & will work on relanding the CL with the extra hack. If we let the revert go through, it will break the trace-view of chrome on Android.
Project Member

Comment 15 by bugdroid1@chromium.org, Sep 15 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/473c74601fb0a32b0d79b3ec73f07df60918c17b

commit 473c74601fb0a32b0d79b3ec73f07df60918c17b
Author: nednguyen <nednguyen@google.com>
Date: Thu Sep 15 12:53:31 2016

Revert of Disabled the v8.todomvc reference build due to missing clock sync markers (patchset #2 id:20001 of https://codereview.chromium.org/2340033002/ )

Reason for revert:
Root issue should be fixed.

Original issue's description:
> Disabled the v8.todomvc reference build due to missing clock sync markers
>
> TBR=nednguyen@chromium.org
> BUG= 646925 
>
> Committed: https://crrev.com/2c1a2b054e002dd463cdf88df02b3e20d550d0dd
> Cr-Commit-Position: refs/heads/master@{#418632}

TBR=nednguyen@chromium.org,oysteine@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG= 646925 

Review-Url: https://codereview.chromium.org/2347493004
Cr-Commit-Position: refs/heads/master@{#418844}

[modify] https://crrev.com/473c74601fb0a32b0d79b3ec73f07df60918c17b/tools/perf/benchmarks/v8.py

Status: Fixed (was: Assigned)
Marking this as fixed (now that the revert is in)
Yep looks like that did it, thanks!
Status: Started (was: Fixed)
The ref build is still failing with clock sync error,...

https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_3-2016-09-16_00-16-33-27537.html
Project Member

Comment 19 by bugdroid1@chromium.org, Sep 16 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/1d291e8613865af7b4e1ef4398fe3a009be21af4

commit 1d291e8613865af7b4e1ef4398fe3a009be21af4
Author: catapult-deps-roller <catapult-deps-roller@chromium.org>
Date: Fri Sep 16 16:05:26 2016

Roll src/third_party/catapult/ 3d7a0942b..be85e5fd4 (1 commit).

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/3d7a0942bcf4..be85e5fd4bf5

$ git log 3d7a0942b..be85e5fd4 --date=short --no-merges --format='%ad %ae %s'
2016-09-16 nednguyen Reenable testTBM2ForSmoke & fix clock sync error on ref build of Chrome on Mac

BUG= 646925 

TBR=catapult-sheriff@chromium.org

Review-Url: https://codereview.chromium.org/2343113003
Cr-Commit-Position: refs/heads/master@{#419179}

[modify] https://crrev.com/1d291e8613865af7b4e1ef4398fe3a009be21af4/DEPS

Status: Fixed (was: Started)

Sign in to add a comment