cc::ProxyMain uses CONTINUE_ON_SHUTDOWN TaskRunner, but depends on it running all tasks. |
|||
Issue description
Inside ImageController::StopWorkerTasks() we synchronize with the worker thread, actually stopping the caller's execution until a task posted to the worker sequence has completed:
// Post a task that will simply signal a completion event to ensure that we
// "flush" any scheduled tasks (they will abort).
CompletionEvent completion_event;
worker_task_runner_->PostTask(
FROM_HERE, base::BindOnce([](CompletionEvent* event) { event->Signal(); },
base::Unretained(&completion_event)));
completion_event.Wait();
However, the task runner to which this is posted is created with CONTINUE_ON_SHUTDOWN semantics:
// The image worker thread needs to allow waiting since it makes discardable
// shared memory allocations which need to make synchronous calls to the
// IO thread.
params.image_worker_task_runner = base::CreateSequencedTaskRunnerWithTraits(
{base::WithBaseSyncPrimitives(), base::TaskPriority::BACKGROUND,
base::TaskShutdownBehavior::CONTINUE_ON_SHUTDOWN});
meaning that during shutdown the posted task may never execute.
This may lead to renderer hangs during orderly shutdown.
,
Aug 30 2017
What is the order of destruction when it comes to destroying the task runner? I take it that we're talking about an external shutdown called on some loop that is not exposed here? Is it fine to change the behavior to BLOCK_SHUTDOWN and then post a (quick) task to flush this at shutdown? Ie, presumably this would be posting to the worker task after Shutdown has been initiated. Alternatively, we can switch this to SKIP_ON_SHUTDOWN and not do anything in the ImageController dtor, since we only really care that we don't crash in the current task. In either case, the task may allocate shared discardable which goes to another process. I'm not sure if that's opening up the possibility to another deadlock in SKIP_ON_SHUTDOWN case (or BLOCK_SHUTDOWN for that matter).
,
Aug 30 2017
(could you also provide a sample crash report, so that I could monitor changes to it)
,
Aug 31 2017
Is ImageController::StopWorkerTasks() called before after the main MessageLoop exits? TaskScheduler shutdown happens after the main MessageLoop exits and thus shutdown behavior shouldn't matter if ImageController::StopWorkerTasks() is only called when the main MessageLoop is running. WaitableEvents are a common source of hangs. Would it be feasible to not access |this| from worker tasks (you could bind an ImageDecodeRequestId/ImageDecodeRequest to image decode tasks instead of accessing |image_decode_queue_| from the task)? Then you wouldn't need to wait on a WaitableEvent in ImageController::StopWorkerTasks().
,
Aug 31 2017
StopWorkerTasks() is called in a few situations where we switch the image cache. The only case that I think is in question is when we're actually destroying things. This is called when LayerTreeHost is destroyed, which is when we call things like LocalFrame::Detach. I've traced it as far as I could, but essentially this is the compositor object being deleted. I'm unsure whether the main message loop would have already exited at this point. What is the order of cleanup when the renderer goes away? As for your other question, unfortunately the task itself is also accessing a cache which would be destroyed here. I don't really see an easy way to make this whole task be standalone (ie without any state). To reiterate, I'd like to find out what the order of shutdown is since that would at least make it obvious that this is the code responsible and maybe we can do cleanup in a more coarse way if we're sure we don't need anything anymore since we're shutting down.
,
Aug 31 2017
Re #5: FWIW I'm not sure whether we're hitting this issue in practice, but while looking at renderer hang reports I did find a load where we were blocked on these events, which got me wondering whether the might simply never return due to the relevant tasks not actually being run any more, due to the change in shutdown semantics.
,
Dec 1 2017
> I'm unsure whether the main message loop would have already exited at this > point. What is the order of cleanup when the renderer goes away? Renderer shutdown just exit()s now, so this is either the ui compositor, or destroying a compositor without exiting the process (closing a tab e.g.). The ui compositor is shutdown before the main loop exits I believe. BrowserMainLoop controls this all if you want to dig into that more. If its a renderer compositor closing, the main loop isn't necessarily being destroyed at all. It sounds to me like it should be BLOCK_ON_SHUTDOWN, but also that this isn't happening cuz we're doing this from the main message loop and the other task runner is idle when shut down. But you can confirm that with BrowserMainLoop I think.
,
Dec 1 2017
Issue 760036 has been merged into this issue.
,
Dec 1 2017
This is definitely not ui compositor because the hang is ProxyMain, which is threaded compositor, and if from ImageController::StopWorkerTasks, we only have an image worker in the renderer compositor. I'm not completely sure on what the destruction order is in the renderer. We initiate TaskScheduler shutdown in ~ChildProcess (https://cs.chromium.org/chromium/src/content/child/child_process.cc?sq=package:chromium&l=93), would this happen before or after the compositor thread teardown?
,
Feb 26 2018
Is this issue P1? I'm changing to P2 but please feel free to set it back if this warrants P1. |
|||
►
Sign in to add a comment |
|||
Comment 1 by schenney@chromium.org
, Aug 30 2017