New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 683475 link

Starred by 1 user

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug



Sign in to add a comment

Duplicate thread name in trace report

Project Member Reported by shunhsingou@chromium.org, Jan 21 2017

Issue description

In CL https://codereview.chromium.org/2450073002, scheduling-only threads is included in trace report. However, for some unknown reasons, in the code we use PID as TID when storing event. (https://codereview.chromium.org/2450073002/patch/80001/90002)

This causes issue when we expect exactly only one thread with the given name when calling process_base::findAtMostOneThreadNamed (https://cs.chromium.org/chromium/infra/appengine/third_party/catapult/tracing/tracing/model/process_base.html?q=%22findAtMostOneThreadNamed%22&l=178&dr=C)

For example we use https://docs.google.com/document/d/1y-2u5nUSo21fyarWCReFYiLRjUjWFDBN1lk_DG22aFg/edit#heading=h.19bkpwh38a9s to manually collect both traces from Chrome and Android to help debugging ARC++ performance issue. We saw error "Expected no more than one Compositor" when loading report recently.

Attachment is an example report.

Is there any reason that we use PID as TID? For solving the problem, is it OK to add a prefix to the thread name, e.g., sched-XXX, when storing scheduling only information?
 
trace_arc.json.zip
4.0 MB Download
To be more clear, the question is about the following line in https://codereview.chromium.org/2450073002/patch/80001/90002: 

var thread = process.getOrCreateThread(pid);

, where the signature of the called function should be process.getOrCreateThread(thread_id)

So we're actually using parsed pid from sched_switch event as the tid (thread_id) stored in the data structure.
This causes some unexpected issue.

For example, in the attached report in the issue, we have sched_switch event reported by kernel ftrace
ompositor-29073 (29066) [005] d..3  9224.2 sched_switch: prev_comm=Compositor prev_pid=29073 prev_prio=112 prev_state=S ==>next_comm=chrome next_pid=28356 next_prio=112

And we also have Chrome trace event in JSON format:
{"pid":29066,"tid":5,"ts":0,"ph":"M","cat":"__metadata","name":"thread_name","args":{"name":"Compositor"}},

In the end we have following data stored for process 29066 trace:

thread: {
     5: {"name": "Compositor"},
     29073: {"name": "Compositor"}
}

However, thread 29073 is not really a thread running in process 29066.

In chrome_renderer_helper.html, we expect there is only one thread name "Compositor" in Chrome trace event.

To fix this problem, can we simply add a prefix to the thread name from scheduling event? Any suggestion?

Comment 2 by jreck@google.com, Jan 23 2017

As I said in the internal bug Linux does not have a TID. Threads are identified by PIDs on Linux.

In this case it's the 29073 that's correct and the 5 that's "wrong". Or rather, 29073 is the OS identifier for the thread and 5 is some Chrome-internal identifier for the thread, thus the conflict.
Got your point. So the problem is why chrome is using internal identifier instead of using system call like gettid() or syscall(SYS_gettid);

Hi Nat,
Could you help on this or do you know who we can ask?


Cc: -charliea@chromium.org
Un-CCing myself. Feel free to re-CC me if necessary.
Cc: -nednguyen@chromium.org
Cc: l...@chromium.org
Concerning #3. I would assume that the internal identifiers are cheaper to get. But I will have to look closer.
Also the assert in tracing.js seems new in M56+ as I can load the trace in M55 browser:

While importing:
Error: Expected no more than one Compositor
    at Process.findAtMostOneThreadNamed (chrome://tracing/tracing.js:2436:7)
    at ChromeRendererHelper (chrome://tracing/tracing.js:922:403)
    at ChromeModelHelper.<anonymous> (chrome://tracing/tracing.js:944:152)
    at Array.forEach (native)
    at ChromeModelHelper (chrome://tracing/tracing.js:944:101)
    at Model.getOrCreateHelper (chrome://tracing/tracing.js:2563:67)
    at ChromeAuditor (chrome://tracing/tracing.js:4758:431)
    at Import.<anonymous> (chrome://tracing/tracing.js:1336:430)
    at Array.map (native)
    at Import.createAuditorsAndRunAnnotate (chrome://tracing/tracing.js:1336:390)
Owner: caseq@chromium.org

Comment 10 by l...@chromium.org, Jan 30 2017

Cc: -l...@chromium.org
Cc: groeck@chromium.org dtor@chromium.org
Cc: sonnyrao@chromium.org
Any update or anything we can help on this?

Reply for #8:
I think it's because CL https://codereview.chromium.org/2450073002 is merged after M56+.
Anyone looking at this? This sometimes blocks tracing on ARC++ now.
Status: Assigned (was: Untriaged)
Components: -Platform>DevTools>Performance>Tracing Speed>Tracing
Owner: ----
Status: Available (was: Assigned)

Sign in to add a comment