Extension SW event dispatch: race in dispatching events to stopped worker |
||||
Issue descriptionTLDR, we use SWContext::StartWorkerForPattern to dispatch extension events to a stopped worker, however, at the point StartWorkerForPattern's callback runs, I see EmbeddedWorkerInstance::status() == RUNNING, but the worker script might not have completed running all its (synchronous) JavaScript (akin to LoadStop for WebContents). For stopped worker, extensions currently rely on the synchronous parts of SW script to have finished. At least two tests currently suffer from this race and shows flakiness: ServiceWorkerLazyBackgroundTest.EventsToStoppedExtension and ServiceWorkerLazyBackgroundTest.EventsAfterRestart ( https://crbug.com/834200 ) Details: (reference: https://cs.chromium.org/chromium/src/extensions/docs/events.md?rcl=203a8b36eb9c5d982fb140ac20e08218dd34b465&l=95, look into L1 registration specifically) Say from an extension sw.js: chrome.tabs.onCreated.addListener(...) In browser process, we'll add a lazy version of the listener (through EventRouter::AddLazyServiceWorkerEventListener, call it L2) and live version of listener (through EventRouter::AddServiceWorkerEventListener, call it L1). The live version contains info about SW (process id, thread id, version id...), but the lazy version won't, it will contain SW scope instead for example. Dispatching an event while the worker is running will use the live version of the listener. (It will still use StartWorkerForPattern, but that's irrelevant here). However once the worker is stopped, browser process will only remember L2 (as expected). Dispatching an event at this point will start the worker through ServiceWorkerContext::StartWorkerForPattern with L2's scope. In normal cases sw.js's chrome.tabs.onCreated will add L1 back again, but if it is delayed after StartWorkerPattern's callback is called, then we won't see L1 in the browser process yet, and dispatching from there would fail as the browser process doesn't have knowledge about the listener. Matt: For lazy background page events, we rely on the background page's DidLoadStop (might not be 100% exact, but that's roughly what happens). This ensures that synchronous scripts are run and L1 is registered as described above. Is there any similar concept for Service Workers that we can use? Also, doesn't vanilla Service Worker events depend on something similar? Assigning to falken@ temporarily, please assign it back to me with response :)
,
Aug 30
Is the problem that sw.js calls chrome.tabs.onCreate.addListener(...) at the top-level of the script but you're seeing it being called after RUNNING status was entered? -> Yes, that's what I'm seeing IIUC. The service worker doesn't reach RUNNING until it finished initial JavaScript execution, so it's expected that top-level JS is done. -> Thanks!, that's the guarantee I was looking for before diving deeper. re: setTimeout/Promise: -> No, addListener is called synchronously from top level JS re: IPC yes, it's an IPC, particularly ExtensionHostMsg_AddListener. The race is plausible between IPC and whatever SW code is using (mojo?). I think I'll take a stab at ensuring that first. Any hint is appreciated :)
,
Aug 30
Yep IPC ordering seems likely. The service worker OnStarted message is on a dedicated Mojo pipe: see EmbeddedWorkerInstanceClient.OnStarted in embedded_worker.mojom. There's no ordering guarantee between that message and legacy IPCs.
,
Aug 31
^^^I can confirm that I'm hitting the out-of-order message arrival in the browser process which is indeed as suspected is the root case. The worker thread sees SendWorkerStarted *after* we dispatch extension message from renderer worker thread, but the browser receives them in reverse order sometimes... Just documenting this for now, will follow up with possible "fixes" for this later. Also, /cc Devlin. Debug code and logs for reference: https://chromium-review.googlesource.com/c/chromium/src/+/1198528/1 Successful case: [201944:201944:0830/174153.817594:ERROR:service_worker_apitest.cc(1609)] Opening new tab to get message [201944:201944:0830/174153.850439:ERROR:event_router.cc(536)] DispatchEventImpl: tabs.onCreated [201944:201944:0830/174153.850488:ERROR:event_router.cc(563)] Dispatch lazy is_for_service_worker: event_name = tabs.onCreated [201944:201944:0830/174153.850588:ERROR:service_worker_task_queue.cc(156)] RunAfterStartWorker [1:14:0830/174154.032396:ERROR:ipc_message_sender.cc(185)] [WORKER] [thread_id: 14], +WorkerThreadDispatcher [1:14:0830/174154.041951:ERROR:ipc_message_sender.cc(231)] [WORKER] [thread_id: 14], AddListener = tabs.onCreated <------------------ [a] [201944:201944:0830/174154.042330:ERROR:extension_message_filter.cc(180)] [UI] ExtensionMessageFilter AddListener: tabs.onCreated <---- [a'] [201944:201944:0830/174154.042365:ERROR:event_router.cc(195)] ER::AddServiceWorkerEventListener: tabs.onCreated [1:14:0830/174154.042333:ERROR:service_worker_context_client.cc(1548)] [WORKER] [thread_id: 14], ServiceWorkerContextClient::SendWorkerStarted <-------- [b] [201944:203022:0830/174154.042927:ERROR:embedded_worker_instance.cc(843)] EmbeddedWorkerInstance::OnStarted[IO] [Will be RUNNING], thread_id = 14 <---- [b'] [201944:203022:0830/174154.043215:ERROR:service_worker_context_wrapper.cc(97)] DidStatWorker, instance->status() = 2 [201944:203022:0830/174154.043236:ERROR:service_worker_task_queue.cc(44)] [IO] DidStartWorkerForPattern: gjfaagkmdpfpdbaajdccmodbajagiica [201944:201944:0830/174154.043291:ERROR:service_worker_task_queue.cc(31)] [UI] DidStartWorkerForPattern->Running Task: RunTask <------------------------[b''] [201944:201944:0830/174154.043331:ERROR:event_router.cc(826)] [UI] DispatchPendingEvent: tabs.onCreated, worker_thread_id = 14 [201944:201944:0830/174154.043377:ERROR:event_router.cc(91)] DispatchExtensionMessage: event_name = tabs.onCreated [201944:201944:0830/174154.049457:ERROR:service_worker_apitest.cc(1614)] Done out-of-order/failure case: [55734:55734:0830/174530.254853:ERROR:service_worker_apitest.cc(1609)] Opening new tab to get message [55734:55734:0830/174530.281004:ERROR:event_router.cc(536)] DispatchEventImpl: tabs.onCreated [55734:55734:0830/174530.281042:ERROR:event_router.cc(563)] Dispatch lazy is_for_service_worker: event_name = tabs.onCreated [55734:55734:0830/174530.281112:ERROR:service_worker_task_queue.cc(156)] RunAfterStartWorker [1:14:0830/174530.429605:ERROR:ipc_message_sender.cc(185)] [WORKER] [thread_id: 14], +WorkerThreadDispatcher [1:14:0830/174530.473726:ERROR:ipc_message_sender.cc(231)] [WORKER] [thread_id: 14], AddListener = tabs.onCreated <---------------------------------- [a] [1:14:0830/174530.474039:ERROR:service_worker_context_client.cc(1548)] [WORKER] [thread_id: 14], ServiceWorkerContextClient::SendWorkerStarted <--------- [b] [55734:56129:0830/174530.474480:ERROR:embedded_worker_instance.cc(843)] EmbeddedWorkerInstance::OnStarted[IO] [Will be RUNNING], thread_id = 14 <------- [b'] [55734:56129:0830/174530.474735:ERROR:service_worker_context_wrapper.cc(97)] DidStatWorker, instance->status() = 2 [55734:56129:0830/174530.474754:ERROR:service_worker_task_queue.cc(44)] [IO] DidStartWorkerForPattern: gjfaagkmdpfpdbaajdccmodbajagiica [55734:55734:0830/174530.475504:ERROR:service_worker_task_queue.cc(31)] [UI] DidStartWorkerForPattern->Running Task: RunTask <---------------------- [b''] [55734:55734:0830/174530.475532:ERROR:event_router.cc(826)] [UI] DispatchPendingEvent: tabs.onCreated, worker_thread_id = 14 [55734:55734:0830/174530.475544:ERROR:event_router.cc(840)] DispatchPendingEvent: NO PROCESS FOUND TO DISPATCH, IMMINENT FAILURE AHEAD [55734:55734:0830/174530.475552:ERROR:event_router.cc(841)] DispatchPendingEvent: NO PROCESS FOUND TO DISPATCH, IMMINENT FAILURE AHEAD [55734:55734:0830/174530.475560:ERROR:event_router.cc(842)] DispatchPendingEvent: NO PROCESS FOUND TO DISPATCH, IMMINENT FAILURE AHEAD [55734:55734:0830/174530.475608:ERROR:extension_message_filter.cc(180)] [UI] ExtensionMessageFilter AddListener: tabs.onCreated <---------------------- [a'] [55734:55734:0830/174530.475619:ERROR:event_router.cc(195)] ER::AddServiceWorkerEventListener: tabs.onCreated
,
Sep 13
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/b8e24bdbf6bbe43bc32fd684f2a0aad914e804d2 commit b8e24bdbf6bbe43bc32fd684f2a0aad914e804d2 Author: Istiaque Ahmed <lazyboy@chromium.org> Date: Thu Sep 13 15:17:25 2018 [Extensions SW] Fix Event registration race. There's no ordering guarantee between SW mojo IPCs and extension IPCs. This (infrequently) causes browser process to see event listener registration *after* the browser process decides to dispatch that event. This CL adds extension IPCs to notify browser process when: - A service worker context has finished loading and executing its global JS scope's code (this corresponds to EmbeddedWorkerInstance::OnStarted) - A service worker context has destroyed. With these IPCs, browser process can identify which extenison SWs are running at a given point, and ServiceWorkerTaskQueue can queue up tasks to dispatch events when a worker hasn't completely started, e.g. when the worker has started from SW mojo IPC's perspective, but it hasn't yet finished executing its global JS code from legacy IPC's perspective. Bug: 879015 Test: EventsToStoppedWorker test doesn't flake locally anymore. Change-Id: I390e102a55acb97d680265d3362d7e01d72aea1b Reviewed-on: https://chromium-review.googlesource.com/1203319 Commit-Queue: Istiaque Ahmed <lazyboy@chromium.org> Reviewed-by: Lei Zhang <thestig@chromium.org> Reviewed-by: Kinuko Yasuda <kinuko@chromium.org> Reviewed-by: Devlin <rdevlin.cronin@chromium.org> Reviewed-by: Makoto Shimazu <shimazu@chromium.org> Cr-Commit-Position: refs/heads/master@{#591014} [modify] https://crrev.com/b8e24bdbf6bbe43bc32fd684f2a0aad914e804d2/chrome/browser/extensions/service_worker_apitest.cc [modify] https://crrev.com/b8e24bdbf6bbe43bc32fd684f2a0aad914e804d2/chrome/renderer/chrome_content_renderer_client.cc [modify] https://crrev.com/b8e24bdbf6bbe43bc32fd684f2a0aad914e804d2/chrome/renderer/chrome_content_renderer_client.h [modify] https://crrev.com/b8e24bdbf6bbe43bc32fd684f2a0aad914e804d2/content/public/renderer/content_renderer_client.h [modify] https://crrev.com/b8e24bdbf6bbe43bc32fd684f2a0aad914e804d2/content/renderer/service_worker/service_worker_context_client.cc [modify] https://crrev.com/b8e24bdbf6bbe43bc32fd684f2a0aad914e804d2/extensions/browser/extension_service_worker_message_filter.cc [modify] https://crrev.com/b8e24bdbf6bbe43bc32fd684f2a0aad914e804d2/extensions/browser/extension_service_worker_message_filter.h [modify] https://crrev.com/b8e24bdbf6bbe43bc32fd684f2a0aad914e804d2/extensions/browser/service_worker_task_queue.cc [modify] https://crrev.com/b8e24bdbf6bbe43bc32fd684f2a0aad914e804d2/extensions/browser/service_worker_task_queue.h [modify] https://crrev.com/b8e24bdbf6bbe43bc32fd684f2a0aad914e804d2/extensions/common/extension_messages.h [modify] https://crrev.com/b8e24bdbf6bbe43bc32fd684f2a0aad914e804d2/extensions/renderer/dispatcher.cc [modify] https://crrev.com/b8e24bdbf6bbe43bc32fd684f2a0aad914e804d2/extensions/renderer/dispatcher.h [modify] https://crrev.com/b8e24bdbf6bbe43bc32fd684f2a0aad914e804d2/extensions/renderer/worker_thread_dispatcher.cc [modify] https://crrev.com/b8e24bdbf6bbe43bc32fd684f2a0aad914e804d2/extensions/renderer/worker_thread_dispatcher.h
,
Sep 13
|
||||
►
Sign in to add a comment |
||||
Comment 1 by falken@chromium.org
, Aug 30