[Media Router] MR extension unable to recover after crash |
|||
Issue descriptionChrome Version: 68.0.3400.0 (though according to internal report, this like repros in as early as 64.0.3282.140) OS: Linux (not sure about other OSes) We have received internal report that Media Router stops working occasionally (manifesting as "device not found"), which requires either switching to another profile, or restarting the browser. From my investigation, I found that this is reproducible when the MR extension process is terminated (probably to simulate a crash, or possibly an update?), occasionally it will not reload properly, resulting in a state where the extension system thinks the extension is loaded, but the background page won't start up. What steps will reproduce the problem? 1. Start up browser with --show-component-extension-options and observe chrome://extensions 2. Open task manager 3. Open MR dialog 4. The "Extension: Chrome Media Router" process should be up. Terminate the process using task manager. 5. The MR extension entry in chrome://extensions should changed. (background_page.html gone, reload option appears). At this point you can wait a few seconds for it to reload itself, as there is a exponential backoff delay (related patch: https://codereview.chromium.org/2281323002/) It is reloaded once the background_page.html link re-appears. 6. Try steps 3-5 a few more times until devices are no longer found. Note that the background_page.html link is no longer clickable. Devlin, do you have some ideas on why the MR extension would get into this state, and/or ways to recover from it?
,
Apr 19 2018
After step 6, is there any process running for the extension? Is it inspectable from chrome://inspect? What does chrome://tracing think it is doing?
,
Apr 19 2018
Re #2: After step 6, there is no process running for the MR extension even if you open the dialog or click on the background_page.html link. There's no entry in chrome://inspect. I am not familiar enough with the extension code to know what to look for in chrome://tracing.
,
Apr 20 2018
The background page will only be opened if there is an event that the extension is registered for that fires while the extension is loaded. If that doesn't happen, the background page will (correctly) remain dormant. Are you sure there's an event firing that should wake up the event page?
,
Apr 23 2018
Devlin: The MR extension registers for events that should fire on extension load. Besides that, Media Router also uses ProcessManager to wake the event page. Even this bug repros, neither option (nor clicking on the background_page.html link - I think that typically works even in absence of events?) can bring up the background page.
,
Apr 23 2018
s/Even this bug repros/When this bug repros
,
Apr 23 2018
I have a more specific repro step now: 1) End the MR extension process. This starts the delayed reload in the extension system. 2) Before the extension reloads (check chrome://extensions), open the Cast dialog, which causes Media Router to use ProcessManager to attempt to wake the event page. Since the extension isn't loaded, this step is expected to have no effect. 3) Wait for the delayed reload to complete, the RELOAD button is replaced with background page link. However, the link doesn't work, and the extension isn't actually reloaded, and there's no way to reload it. So I suspect the MR using ProcessManager to wake the event page is somehow interfering with the delayed reload. But not sure why that would cause the extension to enter a bad state.
,
Apr 26 2018
Derek is investigating. Probably causing user failures for MR in the field in a way users cannot easily fix themselves.
,
May 11 2018
I did some investigation today and I think the bug comes down to this method in LazyBackgroundTaskQueue (https://cs.chromium.org/chromium/src/extensions/browser/lazy_background_task_queue.cc?type=cs&q=LazyBackgroundTaskQueue&sq=package:chromium&l=95): This method is called by MediaRouter when it attempts to wake up the MR extension. Here's a copy of the method with some annotations: void LazyBackgroundTaskQueue::AddPendingTask( content::BrowserContext* browser_context, const std::string& extension_id, PendingTask task) { if (ExtensionsBrowserClient::Get()->IsShuttingDown()) { std::move(task).Run(nullptr); return; } PendingTasksList* tasks_list = nullptr; PendingTasksKey key(browser_context, extension_id); PendingTasksMap::iterator it = pending_tasks_.find(key); if (it == pending_tasks_.end()) { // [0] auto tasks_list_tmp = std::make_unique<PendingTasksList>(); tasks_list = tasks_list_tmp.get(); pending_tasks_[key] = std::move(tasks_list_tmp); // [1] const Extension* extension = ExtensionRegistry::Get(browser_context)->enabled_extensions().GetByID( extension_id); if (extension && BackgroundInfo::HasLazyBackgroundPage(extension)) { // [2] // If this is the first enqueued task, and we're not waiting for the // background page to unload, ensure the background page is loaded. ProcessManager* pm = ProcessManager::Get(browser_context); pm->IncrementLazyKeepaliveCount(extension); // Creating the background host may fail, e.g. if |profile| is incognito // but the extension isn't enabled in incognito mode. if (!pm->CreateBackgroundHost( extension, BackgroundInfo::GetBackgroundURL(extension))) { std::move(task).Run(nullptr); // [3] return; } } } else { tasks_list = it->second.get(); } tasks_list->push_back(std::move(task)); } The problem seem to be that if this method is invoked while the extension crashed and is waiting to reload, then the check in [2] will fail due to |extension| being nullptr and we won't try to create a background host. However the fact that we created an entry in |pending_tasks_|, means that the next time we invoke this method (i.e. when the extension is reloaded), we won't try to create a background host again due to the check in [0]. Hitting the case in [3] also presents a similar problem, although the task is run in this case. Unless the pending tasks entry can be cleared via other means (it doesn't look like the case) it looks like the extension will be stuck in a bad state where the background host won't ever be loaded in this code path. To fix this I propose that (a) the task should be run with nullptr if the check in [2] fails, and (b) move the line at [1] to the end of the if-block. I tried it locally and it works -- after the extension is reloaded, the background host can be created successfully when MR tries to wake up the extension.
,
May 21 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/86c3f0e1a5237885af245181b12b6f5f181571ef commit 86c3f0e1a5237885af245181b12b6f5f181571ef Author: Derek Cheng <imcheng@chromium.org> Date: Mon May 21 18:14:32 2018 [Extension] Create lazy background host if there are pending requests. A pending request could get added to LazyBackgroundTaskQueue while the extension is not enabled (e.g. if it crashed and is waiting to be reloaded). This appears to be by design. In such a case we do not attempt to create the background host. The request (and subsequent requests) could become stuck if the background host does not get created. This patch changes LazyBackgroundTaskQueue to listen for extension loaded, and creates a background host if there are pending requests. This patch also makes sure we only create a PendingTasksList entry if we are enqueueing a request to it. Bug: 835017 Change-Id: Ie2aff8cfb620a1867b033c3474a3277a283fb258 Reviewed-on: https://chromium-review.googlesource.com/1058083 Reviewed-by: Devlin <rdevlin.cronin@chromium.org> Commit-Queue: Derek Cheng <imcheng@chromium.org> Cr-Commit-Position: refs/heads/master@{#560303} [modify] https://crrev.com/86c3f0e1a5237885af245181b12b6f5f181571ef/chrome/browser/media/router/event_page_request_manager.cc [modify] https://crrev.com/86c3f0e1a5237885af245181b12b6f5f181571ef/extensions/browser/lazy_background_task_queue.cc [modify] https://crrev.com/86c3f0e1a5237885af245181b12b6f5f181571ef/extensions/browser/lazy_background_task_queue.h [modify] https://crrev.com/86c3f0e1a5237885af245181b12b6f5f181571ef/extensions/browser/lazy_background_task_queue_unittest.cc
,
May 22 2018
Verified on Mac canary 68.0.3437.2
,
May 29 2018
|
|||
►
Sign in to add a comment |
|||
Comment 1 by imch...@chromium.org
, Apr 19 2018