New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 618262 link

Starred by 1 user

Issue metadata

Status: WontFix
Owner:
Last visit 24 days ago
Closed: Jun 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 3
Type: Bug



Sign in to add a comment

Add heartbeat trace for CategorizedWorkerPool during waiting for more tasks.

Project Member Reported by prashan...@samsung.com, Jun 8 2016

Issue description

When CategorizedWorkerPoolThread is waiting for more tasks, it looks
like it is hung as it starts working only after the related condition
variable is signaled.

Adding heartbeat trace can help detect whether thread is live or not.

Refer http://crbug.com/617010 for more details.
 
Cc: robliao@chromium.org danakj@chromium.org jam@chromium.org reve...@chromium.org
Owner: prashan...@samsung.com
Is the heartbeat necessary? In the referenced hang stack, the thread is simply idle. If this were a true hang, you'd expect to see a culprit further up the stack.

What are we hoping to accomplish with the heartbeat?
Would there come a callstack further up if the thread is hang? I may be wrong.

To identify whether thread is really hung or idle and waiting for next signal, heartbeat trace would help.
Generally yes. You wouldn't be waiting on a condition variable.

Now, if someone was supposed to signal the condition variable but didn't, then you have a problem. However, the heartbeat wouldn't find who was supposed to signal the condition variable. All you would find out is that you timeout. You would also timeout if there was no work to be done.

If a thread is waiting on a condition variable, the thread is effectively idle.

To add heartbeat trace, there is a timeout added. Heartbeat should not generally tell who has signaled the condition variable, it is meant for showing with trace that thread is idle.

If thread is waiting on a condition variable and heartbeat traces are not coming, then thread is hung.

Ideally there should be a mechanism which will detect this kind of situation like we do for GPU process crash.
The issue with a heartbeat within a thread is you can't distinguish between many tasks running or a hung task. All you know is if a thread is idle.

If you're looking to truly monitor hung threads, consider taking a similar approach to https://cs.chromium.org/chromium/src/chrome/browser/metrics/thread_watcher.h

ThreadWatcher runs on a watchdog thread and periodically posts a task to the targeted thread. If the thread doesn't respond within a certain time, then it's truly hung.
Hmm. Then ThreadWatcher stands to be better here as it uses watchdog kinda mechanism.

I'll skip the changes of CategorizedWorkerPool.* from cl https://codereview.chromium.org/2047433004/.

Should we patch up rest of the code? (TimedWait returning true when timeout occurs or false when gets signaled.)
On CV: Only if there is code that will stand to benefit from it. We've not needed it previously, so by default the current behavior is fine.
Status: WontFix (was: Untriaged)
We'll not fix this. We'll implement something like ThreadWatcher.

Sign in to add a comment